Using photodiodes and supervised machine learning for automatic classification of weld defects in laser welding of thin foils copper-to-steel battery tabs

This paper studies whether photodiodes and supervised machine learning (ML) algorithms are sufficient to automatically classify weld defects caused by simultaneous variation of the part-to-part gap and laser power during remote laser welding (RLW) of thin foils, with applications in battery tabs. Photodiode signals are the primary source of data and are collected in real time during RLW of copper-to-steel thin foils in lap joint configuration. Experiments are carried out with the nLight Compact 3 kW fiber laser integrated with the Scout-200 2D scanner. The paper reviews and compares seven supervised ML algorithms (namely, k-nearest neighbors, decision tree, random forest, Naïve-Bayes, support vector machine, discriminant analysis, and discrete wavelet transform combined with a neural network) for automatic classification of weld defects. Up to 97% classification rate is obtained for scenarios with simultaneous variations of weld penetration depth and part-to-part gap. The main causes of misclassification are attributed to the interaction between welding parameters (part-to-part gap and laser power) and process instability at high part-to-part gap (high variation in the process not captured by the photodiodes). Opportunities for further development based on sensor fusion, integration with real-time multiphysical simulation, and semi-supervised ML are discussed throughout the paper.


AI: artificial intelligence
D P: weld penetration depth (μm)
DWT: discrete wavelet transform
ECOC: error-correcting output codes
k-NN: k-nearest neighbor
LoC: lack of connection
ML: machine learning
NN: neural network
OCT: optical coherence tomography
OP: overpenetration
P L: laser power (W)
RLW: remote laser welding
SW: sound weld
S P: signal generated by the radiation from plasma and metal vapor (V)
S T: signal generated by the radiation in the short-wave infrared (V)
S R: signal generated by the reflected laser radiation (V)
SVM: support vector machine
T L: lower material thickness (μm)
T S: throat thickness (μm)
T U: upper material thickness (μm)

INTRODUCTION
With the transition from fossil fuel to electric mobility, it is estimated that at least 30 million zero-emission electric vehicles (EVs) will be on the roads in the EU alone by 2030. The assembly of a single battery pack for EVs requires about 20 000 cell-to-tab welds, which adds up to a few billion welds being made per year. Cell-to-tab welds involve various dissimilar thin foils ranging from a few tens of micrometers to below 500 μm. Going from a few thousand to billions of welds per year is a game changer for current production systems and hence requires tighter and more reliable monitoring, diagnosis, and process control. This is corroborated by the fact that (1) an undetected lack of connection results in voltage drops and malfunction of the whole battery pack; (2) undetected variations in the weld profile result in unequal electrical resistances within the same battery pack, leading to uneven current loads that can reduce the overall electrochemical performance of the battery pack and lead to inhomogeneous cell degradation; 1 (3) excessive weld penetration depth brings the risk of piercing adjacent components (electrodes, etc.) with subsequent leakage of harmful gases and potential thermal runaway.
Remote laser welding (RLW) is the process of choice for a wide range of EV applications and, in particular, for cell-to-tab assembly due to several advantages: single-sided noncontact access, high production rate due to high processing speed, narrow heat affected zone, and ease of automation. 2,3 A number of novel laser welding technologies have been developed and proposed in recent years and hold the promise to stabilize the molten pool and, hence, enlarge the process window. For example, novel laser systems (e.g., green or blue lasers 4 ) and/or beam shaping technologies (e.g., adjustable ring mode laser 5 or optical phased array 6 ) claim improved stability of the keyhole and better coupling of the laser beam. Although some of these technologies are being implemented in the industry, the fact remains that the weld quality is still below the expected targets. In this regard, sensor technologies for in-process monitoring of laser weldments have attracted significant interest. For example, Simonds et al. 7 investigated laser-induced fluorescence (LIF) for monitoring laser spot welding of aluminum to copper thin foil, with the purpose of controlling the formation of intermetallic compounds, and showed that it enables the detection of copper atoms in the vapor plume before sufficient laser energy was deposited to form strong mechanical joints. Chianese et al. 1 investigated the capability of photodiodes to monitor weld penetration depth and part-to-part gap variations during RLW of 300 μm-thick copper-to-steel foils and concluded that the occurrence of weld defects is indicated by the signal features, such as energy intensity and scatter level.
Boosted by recent advances in digital technologies such as machine learning (ML) and artificial intelligence (AI), the concept of intelligent systems for automatic classification of weld defects is now a closer prospect. The underlying principle for the automatic classification of laser weld defects via supervised ML techniques is to generate both defective and nondefective parts (also known as classes), while gathering a set of data/signals via in-process sensors. Data are represented with distinctive features (see Fig. 1) and then the classification algorithm is trained on those features to draw a "decision boundary" so that, when a new and unseen case is presented, the algorithm is able to automatically assign it to the most similar class. The concept of "similarity" is pivotal and differentiates the selection of the algorithm. Cases with poor similarity lead to the problem of misclassifications, i.e., multiple decision boundaries can be drawn in the feature space because the same set of features describes multiple classes. Misclassifications must be avoided since they trigger false positive (type-I error) and/or false negative (type-II error) scenarios and have detrimental effects on production up-time, scrap rate, and product quality.
Classification algorithms have been predominantly implemented for laser welding of thick parts (above 1 mm), whose process window is larger and more robust against process variations than assembly with thin foils (below 500 μm). 8 As a result, the application of classification algorithms to the RLW process of thin foils for battery cell-to-tab welding remains an under-explored area of research. Lee et al. 9 combined photodiodes with SVM, fully connected neural network (FCN), and convolutional neural network (CNN) to estimate the level of weld penetration during laser welding of aluminum to copper thin foils and introduced three classes with respect to the weld penetration mode (penetration limited to the upper foil, penetration of the weld in the lower foil, and transition mode). Sumesh et al. 10 classified metal-arc welding experiments with respect to three classes by training decision tree and random forest algorithms with statistical features that were calculated from signals recorded with microphones. Wang et al. 11 employed high speed photography and SVM to predict weld quality during welding of steel plates with respect to two classes, namely, good and poor welds. Lee et al. developed in situ monitoring of CO 2 laser welding of 0.83 mm-thick galvanized steel using a spectrometer by training k-NN and SVM models with features ranked on the spectroscopic and temporal information of the spectra. 12 Motivated by the fact that photodiodes have a simple structure and low cost, this paper aims at studying whether photodiodes combined with supervised ML models are sufficient to classify weld defects caused by simultaneous variation of the part-to-part gap and laser power during RLW of thin-foil battery tabs. The paper reviews and compares seven ML methods for weld defect classification by introducing three classes: lack of connection, sound weld, and overpenetration.

Equipment and experimental settings
The employed laser unit was the nLight Compact 3 kW fiber laser (nLight Inc., USA), and the laser beam was delivered by a 2D scanner (Scout-200, Laser & Control K-lab, South Korea). The photodiode-based sensor LWM 4.0 (Laser Welding Monitoring, Precitec GmbH, Germany) was used to record optical emission with wavelength within the following three ranges: 300-700, 1200-2000, and 1020-1090 nm, corresponding, respectively, to the S P signal (plasma), S T signal (temperature), and S R signal (back reflection), at a maximum sampling rate of 50 kHz. The sensor was aligned to the center of the molten pool/keyhole and was installed just below the collimator of the scanner, close to the camera port [see Fig. 2]; see also Tables I and II. Laser beam wobbling was implemented, and the laser beam motion consisted of the superimposition of a circular oscillation (500 Hz and radius equal to 0.2 mm) and a linear motion with a speed of 120 mm/s. The laser power (P L ) was delivered in continuous mode and the direction of the laser beam was perpendicular to the specimens (this was enabled via the F-theta optics). The position of the focal point was set 500 μm above the lower surface of the steel foil. Experiments were performed without filler wire and shielding gas. A "check surface" was placed below the steel foil so that a mark on the check surface would indicate full penetration of the foils by the laser beam [see Fig. 2].
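The wobbling motion can be illustrated with a short sketch. It superimposes the circular oscillation (500 Hz, 0.2 mm radius) on the linear motion (120 mm/s) quoted above. The choice of the x axis as the welding direction, the starting phase, and the function names are assumptions made only for illustration; they are not taken from the experimental setup.

```python
import math

# Values quoted in the text.
WOBBLE_FREQ_HZ = 500.0
WOBBLE_RADIUS_MM = 0.2
WELD_SPEED_MM_S = 120.0


def beam_position(t_s):
    """Return the (x, y) beam position in mm at time t_s (seconds).

    Assumption: welding direction along +x, oscillation starting at phase 0.
    """
    phase = 2.0 * math.pi * WOBBLE_FREQ_HZ * t_s
    x = WELD_SPEED_MM_S * t_s + WOBBLE_RADIUS_MM * math.cos(phase)
    y = WOBBLE_RADIUS_MM * math.sin(phase)
    return x, y


def advance_per_oscillation():
    """Linear travel of the beam (mm) during one full circular oscillation."""
    return WELD_SPEED_MM_S / WOBBLE_FREQ_HZ
```

With these values, the beam advances 0.24 mm along the seam per oscillation, which is of the same order as the oscillation radius, so consecutive loops overlap.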

Design of experiments and generation of datasets
Two process parameters were considered in the design of experiments, namely, the laser power and the part-to-part gap. The laser power was varied to emulate manufacturing scenarios with variable weld penetration depth and, possibly, variations of material absorptivity. Variation of the part-to-part gap relates to nonrepeatable clamping and dimensional inaccuracies. The part-to-part gap was controlled by using shim packs. Specimens were 70 mm long and 30 mm wide and were wiped with acetone prior to welding to remove any surface contamination. All welds were in lap configuration with a welding length of 40 mm. Three replications for each experiment were executed in randomized order to avoid unknown bias effects. For each experiment, one cross section was taken at the middle of the weld; each cross section was ground and polished (no etching) and, after mounting in resin disks, images were recorded with the Nikon Eclipse LV150N microscope.
Three datasets were generated (Table III) with the aim to classify weld defects caused by only variations of the laser power (dataset A); only variations of the part-to-part gap (dataset B); and simultaneous variations of laser power and part-to-part gap (dataset C).
Three geometrical features were measured (see Fig. 3) in each cross section: (1) weld penetration depth, D P ; (2) throat thickness, T S ; and (3) actual part-to-part gap. T S was measured as the shortest distance of the weld profile from the bottom corner of the upper material. Those features were then used to label the welds in three classes.
Definition of the three classes derived from the need to meet the safety requirements (no laser piercing through the bottom foil) and satisfy the electrical and mechanical targets (via control of T S and D P ).
The reasoning behind the selection of the labeling criteria is discussed as follows:
(1) Overpenetration carries the risk of piercing adjacent components and of thermal runaway. Looking at Figs. 3(b)-3(c), it appears that both cases (b) and (c) represent full-penetration welds (molten layer fully extended throughout the two foils). However, the case in (c) has a blind keyhole, which does not propagate throughout the bottom foil. As such, the laser radiation [shown as small arrows in Figs. 3(b) and 3(c)] is only absorbed by the keyhole walls (or back-reflected towards the top) and does not pierce through the bottom of the steel foil. Following this logic, only case (b) is labeled as overpenetration.
(2) A minimum level (35% of T L ) of weld penetration depth is required to ensure mechanical resistance.
(3) A minimum level (75% of T U ) of throat thickness is required to ensure both electrical conductivity and mechanical resistance.
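A minimal sketch of how such labeling criteria could be encoded is given below. The 35% and 75% thresholds are those stated above; the use of a check-surface mark as the overpenetration indicator, the treatment of every remaining case as lack of connection, and the function and argument names are assumptions for illustration, not the labeling procedure actually used for the datasets.

```python
def label_weld(d_p_um, t_s_um, t_u_um, t_l_um, check_surface_marked):
    """Assign one of the three classes to a weld cross section.

    d_p_um: weld penetration depth D_P (um)
    t_s_um: throat thickness T_S (um)
    t_u_um: upper material thickness T_U (um)
    t_l_um: lower material thickness T_L (um)
    check_surface_marked: True if the laser pierced the bottom foil
        (assumed here to be the overpenetration indicator).
    """
    if check_surface_marked:
        return "OP"  # overpenetration
    deep_enough = d_p_um >= 0.35 * t_l_um    # criterion (2)
    thick_enough = t_s_um >= 0.75 * t_u_um   # criterion (3)
    if deep_enough and thick_enough:
        return "SW"  # sound weld
    # Assumption: any weld failing a minimum is grouped as lack of connection.
    return "LoC"
```

For example, with a 200 μm upper foil and a hypothetical 300 μm lower foil, a weld with D P = 120 μm and T S = 160 μm would be labeled as a sound weld under these assumptions.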

Signal processing and definition of signal features
The photodiode recorded three signals, S P , S T , and S R , during each experiment. Both hardware and software gains were set to clamp the signals in the range [0, 10] V.
Since one cross section was taken at the middle of the weld for each experiment, the signals were extracted from the same location. For this purpose, a cropping window of 6 mm (corresponding to a duration of 0.05 s and approximately 2500 readings) was used and centered in the middle of the recorded signal. The 6 mm window was chosen to allow ±3 mm tolerance during the cutting stage of the cross section itself. This approach ensured consistency between the results of the cross sections and the data collected by the sensor.
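The window size follows directly from the welding speed and the sampling rate: 6 mm at 120 mm/s lasts 0.05 s, which at 50 kHz gives 2500 readings. A minimal sketch of the centered cropping is shown below; the function name and the assumption that the signal was sampled at the full 50 kHz rate are illustrative.

```python
def crop_center(signal, fs_hz=50_000.0, speed_mm_s=120.0, window_mm=6.0):
    """Return the samples inside a window of window_mm centered on the signal.

    The window length in samples is (window_mm / speed_mm_s) * fs_hz.
    """
    n_window = int(round(window_mm / speed_mm_s * fs_hz))
    n_window = min(n_window, len(signal))
    start = (len(signal) - n_window) // 2
    return signal[start:start + n_window]
```

A 40 mm weld recorded at these settings contains roughly 16 667 readings, of which the central 2500 are retained.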
Statistical features were then extracted from the cropped signal: the mean value, μ, and the scatter level, σ. The mean value is proportional to the total energy content of the emitted radiation; the scatter level is proportional to the uncontrolled process variations and was calculated as the standard deviation of the noise content at frequencies above 100 Hz. Therefore, each data point in the dataset had a six-tuple of signal features, {μ P , σ P , μ T , σ T , μ R , σ R }. Figure 4 illustrates the 2D case (μ P and σ P ) of the feature space for the three analyzed datasets. Figure 4(a) shows that the variation of the weld penetration depth (dataset A) results in a gradual transition from lack of connection to overpenetration. This transition has a negative effect on the capability to classify the weld defects; this point will be discussed thoroughly in the "Results" section. In contrast to dataset A, in dataset B [Fig. 4(b)], the occurrence of weld defects is indicated by abrupt changes of the signals, which results in two distinct classes (lack of connection and sound weld). Dataset C in Fig. 4(c) shows the simultaneous variation of part-to-part gap and laser power; this leads again to a gradual transition with overlapped regions.
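The two statistical features can be sketched as follows. The mean is computed directly; for the scatter level, the sketch removes the slow trend with a centered moving average and takes the standard deviation of what remains. The moving-average filter is only a crude stand-in for a high-pass filter with a cutoff near 100 Hz (its window length uses the approximate relation N ≈ 0.443·fs/fc); the filter actually used for the published results is not described in this section, and all names here are illustrative.

```python
import math


def moving_average(x, window):
    """Centered moving average; the window shrinks near the signal edges."""
    half = window // 2
    prefix = [0.0]
    for value in x:
        prefix.append(prefix[-1] + value)
    n = len(x)
    out = []
    for i in range(n):
        lo = max(0, i - half)
        hi = min(n, i + half + 1)
        out.append((prefix[hi] - prefix[lo]) / (hi - lo))
    return out


def mean_and_scatter(x, fs_hz=50_000.0, cutoff_hz=100.0):
    """Return (mu, sigma) for one cropped photodiode signal.

    mu: mean value of the signal.
    sigma: sample standard deviation of the content left after removing
        the slow trend (assumed approximation of the >100 Hz noise).
    """
    mu = sum(x) / len(x)
    # Odd window length whose -3 dB point is roughly at cutoff_hz.
    window = max(3, int(round(0.443 * fs_hz / cutoff_hz)) | 1)
    trend = moving_average(x, window)
    resid = [a - b for a, b in zip(x, trend)]
    r_mean = sum(resid) / len(resid)
    var = sum((r - r_mean) ** 2 for r in resid) / (len(resid) - 1)
    return mu, math.sqrt(var)
```

Applying `mean_and_scatter` to S P , S T , and S R in turn yields the six-tuple of features for one experiment.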
Although dataset B is small (only 14 data points) and, therefore, is not suitable alone for training ML models, it has been used in conjunction with datasets A and C (A ∪ B ∪ C) to test the generalization of the selected ML models.

ML models for classification of weld defects
The paper benchmarks seven ML models 8,10-13 for the classification of weld defects.
(1) k-NN: a data point is classified by a plurality vote of its "k" neighbors in the feature space, with the data point being assigned to the class most common among its nearest neighbors.
(2) Decision tree: classification is determined by binary decisions at different nodal levels, with the observation being assigned to one of the two branches based on attribute values. The final decision results in assignment to a class at the last level, which is also called a "leaf."
(3) Random forest: as individual decision trees tend to overfit, random forests that consist of an ensemble of decision trees are used to prevent overfitting.
(4) Naïve-Bayes: a probabilistic classifier with boundaries between classes that are defined in the space of the observed attributes by leveraging the Bayes theorem and assuming that the features are conditionally independent; coupling with kernel density estimation enables higher accuracy.
(5) SVM: since SVM is a binary classifier and this paper deals with three classes, we used the error-correcting output codes (ECOC) model instead, which combines several binary learners into a multiclass classifier.
(6) Discriminant analysis.
It is worth noting that these methods are optimized to work with a low number of features (below 100). Parameters and kernels for each of the considered ML models are reported in Table IV. The paper also introduces the DWT to gather additional features beyond the statistical ones. DWT allows representing the signal in the time-frequency domain. 8 In contrast to the fast Fourier transform (FFT), which is only capable of representing stationary signals in the frequency domain, DWT deals with nonstationary signals and provides frequency band information in the time domain. It also allows coping with local spikes, discontinuities, and fluctuations. The interest in exploring the potential of DWT stems from the idea of accounting for the frequency content of the oscillations/fluctuations in the signals.
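To make the time-frequency idea concrete, the sketch below implements a multilevel DWT with the Haar wavelet. The Haar wavelet is chosen here only because it is the simplest case; the wavelet family and decomposition depth used for the published results are not stated in this section, and the function names are illustrative.

```python
import math


def haar_step(x):
    """One DWT level: low-pass (approximation) and high-pass (detail) halves."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    return approx, detail


def haar_dwt(x, levels):
    """Return (approximation, [detail_level_1, detail_level_2, ...])."""
    approx = list(x)
    details = []
    for _ in range(levels):
        if len(approx) < 2 or len(approx) % 2:
            raise ValueError("signal length must be divisible by 2**levels")
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx, details


def haar_idwt(approx, details):
    """Invert haar_dwt; reconstruction is exact up to rounding."""
    s = math.sqrt(2.0)
    x = list(approx)
    for detail in reversed(details):
        rebuilt = []
        for a, d in zip(x, detail):
            rebuilt.append((a + d) / s)
            rebuilt.append((a - d) / s)
        x = rebuilt
    return x
```

Each level halves the number of approximation coefficients, so a signal of N samples yields N/2, N/4, N/8, ... coefficients at successive levels, while the approximation and detail sets together still contain N values in total.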
Our hypothesis is that those fluctuations are directly related to the dynamics of the keyhole and the molten pool. For instance, during the RLW process, the keyhole is kept open according to a pressure equilibrium whose balance is influenced by the process parameters and their variations, hence leading to oscillations/fluctuations in the signal itself. 14 DWT starts by passing the signal iteratively through digital low pass filters with the impulse response, called scaling function, and through high pass filters, the wavelet function. The result of these iterations is an ordered subset of N s coefficients that can be arranged in a sparse matrix and allow a nonredundant representation of the signal with perfect reconstruction upon inversion. Coefficients corresponding to high frequency-bands are close to zero and hence have low contribution toward the reconstruction of the original signal. 8,14 Following this logic, the representation of the signal with N s /2, N s /4, and N s /8 have progressively lower dimensionality but carry less information, as shown in Fig. 5. In our implementation, the set of coefficients generated via DWT are
Table IV (excerpt). Parameters of the ML models.
(1) k-NN
• Number of neighbors: tested both 2 and 3. With k > 3 accuracy degraded.
• Standardization of values of the predictors.
(2) Decision tree
• Algorithm: classification and regression tree (CART) with Gini diversity index split criterion.

We trained ML models (1)-(6) with the six statistical features, {μ P , σ P , μ T , σ T , μ R , σ R }. Conversely, the NN was trained using the coefficients (1000+) of the DWT. It is worth noting that NNs can process inputs with high dimensionality. 15 NN algorithms have been used for the classification of weld defects, and the artificial neural network is the basis for a number of different types of NN algorithms, with deep neural networks (DNNs) and CNNs being the most promising 9,13,16 . This paper implemented a single fully connected layer that maps the input layer (whose nodes are the DWT coefficients) to the output layer (whose nodes are the three classes), with a bias vector and without hidden layers. The softmax function was then used for normalization. To avoid any randomness during the training, the weights of the fully connected layer were initialized to zero.
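A network with a single fully connected layer, a bias vector, and a softmax output is equivalent to multinomial logistic regression. The sketch below shows such a layer with zero-initialized weights, so that repeated runs are deterministic. The plain gradient-descent update on the cross-entropy loss and the learning rate are assumptions for illustration; the training algorithm used for the published results is not described in this section.

```python
import math


def softmax(z):
    """Numerically stable softmax of a list of scores."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    total = sum(e)
    return [v / total for v in e]


class SoftmaxLayer:
    """Fully connected input-to-output layer with bias, no hidden layers."""

    def __init__(self, n_inputs, n_classes=3):
        # Zero initialization removes randomness from training.
        self.w = [[0.0] * n_inputs for _ in range(n_classes)]
        self.b = [0.0] * n_classes

    def predict_proba(self, x):
        scores = [sum(wi * xi for wi, xi in zip(row, x)) + bias
                  for row, bias in zip(self.w, self.b)]
        return softmax(scores)

    def sgd_step(self, x, label, lr=0.1):
        """One gradient step on the cross-entropy loss (assumed optimizer)."""
        p = self.predict_proba(x)
        for c in range(len(self.b)):
            grad = p[c] - (1.0 if c == label else 0.0)
            self.b[c] -= lr * grad
            self.w[c] = [wi - lr * grad * xi for wi, xi in zip(self.w[c], x)]
```

Before training, every input is mapped to equal probabilities of 1/3 for the three classes; each update then raises the probability of the labeled class for that input.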
The composition of the dataset can affect the performance of the trained model. When classes are not equally represented by the observations, the dataset is defined as imbalanced. Several techniques are used to handle imbalanced classification problems, which involve undersampling of the majority class, oversampling of the minority class, applying cost functions, or synthetic data generation/augmentation. 17 In this paper, only dataset C was imbalanced since we performed more experiments at higher part-to-part gaps to account for process instability at higher power. The imbalance of the dataset was addressed by data augmentation via a linear combination of signals belonging to the same minority class. Augmented signals were generated by considering two signals within the same class. Each element of the new augmented signal was calculated as the average of the elements in the same position of the two original signals. The class composition of dataset C before and after class balancing is reported in percentages in Table V. The accuracy of the ML models was evaluated using leave-one-out cross-validation. Implementation was carried out in MATLAB, and both datasets and source codes are available from the Zenodo portal (https://zenodo.org/record/6732794#.YrnEb3ZBzIU).
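The augmentation rule and the leave-one-out evaluation can be sketched together. The averaging function follows the element-wise rule described above. The evaluation loop uses a plain k-NN classifier with Euclidean distance as an example model; predictor standardization is omitted for brevity, and all function names are illustrative rather than taken from the published MATLAB code.

```python
import math
from collections import Counter


def augment_by_averaging(signal_a, signal_b):
    """New signal whose elements are the averages of two same-class signals."""
    return [(a + b) / 2.0 for a, b in zip(signal_a, signal_b)]


def knn_predict(train_x, train_y, query, k=3):
    """Plurality vote among the k nearest training points."""
    ranked = sorted((math.dist(x, query), y) for x, y in zip(train_x, train_y))
    votes = Counter(label for _, label in ranked[:k])
    # On a tied vote, the class seen first (i.e., the nearest) is returned.
    return votes.most_common(1)[0][0]


def loocv_accuracy(features, labels, k=3):
    """Leave-one-out cross-validation: each point is tested once against
    a model built from all remaining points."""
    correct = 0
    for i in range(len(features)):
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if knn_predict(train_x, train_y, features[i], k) == labels[i]:
            correct += 1
    return correct / len(features)
```

When augmented signals are present, they should be kept out of the training fold whenever one of their parent signals is the held-out point; otherwise the held-out information leaks into training. This sketch does not handle that case.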

Metallographic analysis
Characterization and metallographic analysis of datasets A and B have already been addressed in Ref. 1. Hence, this paper only reviews the results of dataset C. The experimental campaign consisted of 86 experiments ranging from lack of connection to overpenetration, as shown with cross sections in Fig. 6(a). It is worth noting that the transition (at no part-to-part gap) from a sound weld to overpenetration is indicated by the change of the geometric shape of the weld seam from conical to cylindrical, reflecting that the laser pierces the steel foil and marks the check surface. Figures 6(b)-6(c) show the average and the standard deviation of the weld penetration depth, as measured in the cross sections. As expected, weld penetration depth increases with increasing laser power and decreases with increasing part-to-part gap. This trend results in overpenetration in those experiments with laser power P L = [840, 990] W.
Conversely, a lack of connection was observed at P L = [390, 540] W. Higher values of standard deviation are observed when gap ≥ 150 μm, especially for P L = [840, 990] W. This is due to the fact that gravity prevails over viscous stresses and surface tension, and hence, the molten copper sinks into the part-to-part gap, generating an unstable molten pool. 18 This instability leads to a significant lack of process repeatability (when gap ≥ 150 μm, the standard deviation is approximately 125 μm, which is more than half of the copper thickness) and consequently to the coexistence of the two classes (lack of connection and overpenetration) for the same experiment. This is further illustrated in Fig. 7.

Characterization of signals
Mean value and scatter level have been calculated to characterize the response of signals S P , S T , and S R to simultaneous variations of laser power and part-to-part gap.
Regression models of the six statistical features were plotted as contours against laser power and part-to-part gap (Fig. 8). A two-way ANOVA was implemented to test the statistical significance of the process parameters against the signals, with the significance level set at 5%. ANOVA tests the null hypothesis that the process parameter has no impact on the signal feature, against the alternative hypothesis that its variation has a significant impact and is reflected by the variation in the signal itself. Results of the two-way ANOVA are given in Table VI. If the p-value is lower than 5%, then the alternative hypothesis is accepted. Results are discussed as follows:
• Plasma and temperature signals (S P and S T ): the p-value far below 5% suggests that the effect of both laser power and part-to-part gap on both plasma and temperature signals is statistically significant. This means that variations in laser power and part-to-part gap are well represented by the variations in S P and S T . Furthermore, the regressions in Figs. 8(a)-8(d) show a positive correlation with the laser power and a negative correlation with the part-to-part gap. This is justified by the fact that the higher the power, the stronger the plasma plume formation and, equally, the thermal field. Besides, with an increasing gap, a higher amount of emitted radiation is dispersed between the two foils; hence, process radiation decreases with the increasing gap.
• Back-reflection signal (S R ): variations of laser power (p-value = 7.8%) and part-to-part gap (p-value = 100%) do not statistically describe the variations observed in the back-reflection.

Classification of weld defects
The performance of the selected ML models was evaluated in terms of classification accuracy, and the results are given in Table VII. The accuracy is expressed in percentage and represents the ratio between the number of correct predictions and the total number of data points in the dataset. Low accuracy corresponds to a high level of misclassification. To test the generalization, the classification models were also tested against the combined dataset A ∪ B ∪ C. It is worth noting that the coating and the thickness of the copper differ between datasets A/B and dataset C (see Table III).
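The accuracy metric defined above can be written in a few lines; the sketch also tallies (true class, predicted class) pairs, which is a convenient way to see which classes are confused with each other. The function names and the class codes LoC, SW, and OP used in the example are illustrative.

```python
from collections import Counter


def accuracy_percent(true_labels, predicted_labels):
    """Correct predictions divided by total data points, in percent."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return 100.0 * correct / len(true_labels)


def confusion_counts(true_labels, predicted_labels):
    """Count each (true class, predicted class) pair."""
    return Counter(zip(true_labels, predicted_labels))
```

For instance, with true labels LoC, SW, SW, OP and predictions LoC, SW, OP, OP, three of four predictions are correct (75%), and the single error is a sound weld predicted as overpenetration.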
Main findings are articulated as follows:

Discussion about the accuracy
The accuracy of the ML models trained with dataset A is significantly affected by the relatively small size of the dataset (46 experiments) and by the overlaps in the feature space between clusters of experiments belonging to different classes [see Fig. 4(a)]. Gradual variation of weld penetration depth from lack of connection to overpenetration determines a smooth transition in the signal features, which leads to overlapped regions. The highest classification accuracy was achieved by the random forest (87%) and the second highest by the DWT&NN (84.8%). Class balancing applied to dataset C via data augmentation has a positive impact, even if the interaction between simultaneous variations of part-to-part gap and weld penetration depth determines the overlap of signals with different labels. DWT&NN has overall the best performance with 97.5% accuracy on dataset C. This result can be explained considering that the DWT coefficients also carry information about the frequency content of the signal, enabling better performance than the statistical descriptors (mean and scatter level), which are limited to the time domain. Statistical descriptors are, therefore, more susceptible to confusion due to overlap of experiments with different labels in the signal feature space.

Discussion about the generalization
When trained with the mixed dataset (A ∪ B ∪ C), all algorithms perform worse than when trained with dataset C alone (which carries most of the data points compared to A and B). This is explained considering that experimental conditions vary across the three experimental campaigns. Indeed, copper foils used in datasets A and B are 300 μm-thick, whereas those used for dataset C are 200 μm-thick and Ni-plated. As a result, differences in the physical phenomena are not reflected in the signals, which do not carry sufficient information with reliable patterns for classification. Results show that the DWT&NN is the algorithm that best generalizes, with 92.7% accuracy. Higher accuracy and the capability to generalize to the case of simultaneous variations of the part-to-part gap and laser power can improve performance in the diagnosis and classification of weld defects.

Discussion about the misclassification
Analysis of the misclassifications shows that there are two main sources of confusion: (1) significant variation in the welding process itself as a consequence of the nature of the materials being welded (i.e., highly reflective materials such as copper) or induced by part-to-part gaps; as observed in Fig. 6(c), when gap ≥ 150 μm, the standard deviation of the weld penetration depth is approximately 125 μm, which is more than half of the copper thickness, and those variations are not captured by the signal features; (2) the interaction between process parameters, which may lead to similar weld profiles but different signal features, as well as the concurrent occurrence of overpenetration and lack of connection; this results in overlapped regions as shown in Fig. 6(d) and demonstrated in Fig. 9.
Since photodiodes only measure the radiation emitted during the process but not directly the weld features, the results indicate that passive observation via photodiodes provides useful information for the classification but does not lead to an exhaustive indication of the process status. For instance, the occurrence of shrinkages (reduction in the throat thickness) in the upper foil might not have a significant effect on the radiation and, therefore, cannot be captured by the signal features. Since the common consensus is to achieve classification accuracy as close as possible to 100% to avoid both false positive (type-I error) and false negative (type-II error) scenarios, current limitations call for future improvements. Actions for improvements are discussed as follows:
• Sensor fusion: integration of photodiodes with other types of sensors, such as OCT (for the direct measurement of the weld penetration depth) and vision systems/laser scanners (for the direct measurement of the seam top surface and throat thickness), can enable gathering data with complementary information. In this regard, sensor fusion might be a viable option to overcome the limitations of individual sensors; however, cost and maintenance issues must be considered too.
• Integration with multiphysical simulation: subsurface weld features (e.g., weld pores), which are not currently detectable by sensor technology, might be predicted via multiphysical simulation. Recent advancements in computational power and fidelity of the multiphysical simulations 19 are opening new opportunities to develop digital-twin models which leverage both sensor and simulation data.
• Further developments in machine learning: main findings also showed that the performance of the classification algorithms improves with increasing size of the training dataset. Therefore, the availability of large datasets is of paramount importance. However, the collection of large datasets with both defective and nondefective parts is unavoidably expensive (i.e., defects need to be made on purpose) and time-consuming. Furthermore, data labeling is a manual process and prone to errors. Semi-supervised ML approaches, which would rely on a mixed dataset of both "labeled" and "unlabeled" data, could be investigated to significantly reduce the cost and time of data collection and labeling.

CONCLUSIONS
This paper investigated the use of photodiodes and ML classification models for the automatic classification of weld defects during RLW of copper-to-steel thin foils. Three classes were introduced (lack of connection, sound weld, and overpenetration), driven by safety, electrical, and mechanical requirements arising during battery cell-to-tab assembly. Seven supervised ML models were implemented and benchmarked.
The key findings are discussed as follows:
• Plasma and temperature are the predominant signals which carry most of the information about the process and are well correlated to both variations of the part-to-part gap and laser power. The back-reflection signal showed a weak contribution.
• Since the process window is significantly narrow when welding thin foils (below 500 μm) and the interaction between parameters generates process instability (when gap ≥ 150 μm at high power, the standard deviation of the weld penetration depth is approximately 125 μm, which is more than half of the copper thickness), variations in the process are not captured by the signal features, resulting in misclassifications.
• Up to 97.5% classification accuracy is achieved for scenarios with simultaneous variations of weld penetration depth and part-to-part gap.
• Photodiodes integrated with ML models provide useful information for the classification but do not lead to an exhaustive indication of the process status. Opportunities for sensor fusion and integration with real-time multiphysical simulation have been highlighted as a future stream of research and development.