Abstract: During the production of an automobile, various fluids required for its operation, such as steering fluid, brake fluid, and radiator coolant, are filled into the vehicle via a specific process. Any problems, such as leakage in the fluid systems, should be identified during the filling process, and the necessary corrections must be made to the automobile before it moves forward in the production line. The fluid filling process consists of a vacuuming step followed by a filling step. This paper presents the results of our research on brake fluid system quality based on sensor data recorded during the filling process. The filling dataset contains two time series corresponding to the vacuuming and filling steps. First, we use this raw data to construct a dataset with binary labels (1: faulty, 0: not faulty). We then use this dataset to build machine learning models using both classical methods and convolutional neural networks. Results show that gradient boosting methods perform better under the current settings, and that there are improvement opportunities in the convolutional neural network architectures.
https://doi.org/10.1109/ASYU48272.2019.8946399
Data Analysis for Automobile Brake Fluid Fill Process Leakage Detection using Machine Learning Methods
1. 1/21
Kürşat İnce
HAVELSAN A.Ş.
Yakup Genç
Gebze Teknik Üniversitesi
Data Analysis for Automobile Brake Fluid Fill Process Leakage Detection using Machine Learning Methods
2. 2/21
Agenda
Leakage Detection in Automobiles
Background Information
Sensor Data and Dataset Preparation
Methods and Evaluation Metrics
Results and Discussion
Conclusion
3. 3/21
Leakage Detection in Automobiles
Filling Station: Brake fluid, power steering fluid, and coolant.
4. 4/21
Filling Process
Vacuum
Vacuum pressure drops to 3 mbar in about 50-60 seconds.
Wait 5 seconds.
Pressure should not increase by more than 0.5 mbar (see the sketch after this slide).
Fill
Filling fluid is pumped into the system: 600-700 ml of brake fluid, or
7-8 l of coolant.
If a cycle fails, the operator performs fixes at the station.
If the problem cannot be fixed, the car is moved to the exit.
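As an illustration, the vacuum-decay rule above can be sketched in a few lines of Python; the constants and function name are ours, taken from the numbers on this slide, not from the paper's implementation.

```python
# Hypothetical vacuum-decay leak check mirroring the rule above:
# after pumping down to ~3 mbar, hold for 5 seconds; a rise of more
# than 0.5 mbar during the hold indicates a leak.

TARGET_VACUUM_MBAR = 3.0   # vacuum level reached in ~50-60 s
HOLD_SECONDS = 5.0         # wait time after reaching the target
MAX_RISE_MBAR = 0.5        # allowed pressure rise during the hold

def is_leaky(pressure_start: float, pressure_end: float) -> bool:
    """Return True if the pressure rose too much during the 5-second hold."""
    return (pressure_end - pressure_start) > MAX_RISE_MBAR
```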
5. 5/21
Some Background
Renault & Nissan, Alliance Vehicle Evaluation Standard (AVES), 2001.
R. S. Peres, et al., "Multistage quality control using machine learning in the
automotive industry," IEEE Access, vol. 7, pp. 79908-79916, 2019.
K. Chen, et al., "Prediction of weld bead geometry of MAG welding based on
XGBoost algorithm," The International Journal of Advanced Manufacturing
Technology, vol. 101, pp. 2283-2295, Apr 2019.
E. Ardıç and Y. Genç, "Classification of 1D signals using Deep Neural Networks,"
in 2018 26th Signal Processing and Communications Applications Conference
(SIU), May 2018.
6. 6/21
Discussion for the Background
Quality control using machine learning methods on the factory floor is an active
research area.
Gradient boosting and random forest provide better results than other
classical machine learning methods.
Deep neural networks (CNNs, etc.) are getting attention from researchers.
K. Chen et al.: XGBoost-based models are more interpretable than black-box
models such as CNNs.
7. 7/21
Sensor Data
Time series data:
operation: Type of operation (either vacuum or fill)
time: Timestamp for the sensor reading
machineID: Filling machine identifier
chassis: Automobile chassis number
vacuumpressure: Pressure sensor reading during vacuuming
fillamount: Volume of the filling fluid
fillpressure: Pressure sensor reading during filling
12,151,666 readings at 2 Hz.
51,250 unique car chassis.
10. 10/21
Dataset Preparation
Split the readings into cycles where the time between
consecutive readings is greater than 2000 milliseconds.
For each cycle, extract vacuumpressure, fillamount,
and fillpressure, and assign a label.
If more cycles are observed for the same chassis in the
future, label that cycle as 1 (leakage/failed); otherwise label
it as 0 (successful).
Vectorize each cycle into 200 readings for vacuumpressure,
130 readings for fillamount, and 130 readings for fillpressure,
constructing a 460-feature vector for each cycle (see the sketch below).
The resulting dataset has 53,254 samples: 51,250
negative (96.23%) and 2,004 positive (3.77%).
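A minimal sketch of this preparation, assuming the column names from the sensor-data slide; the helper names (resize, cycles_for_chassis, vectorize) are ours and purely illustrative, not the paper's code.

```python
import numpy as np
import pandas as pd

GAP_MS = 2000                                # gap that separates two cycles
LEN_VAC, LEN_AMT, LEN_PRS = 200, 130, 130    # 200 + 130 + 130 = 460 features

def resize(values: np.ndarray, length: int) -> np.ndarray:
    """Resample a variable-length sequence of readings to a fixed length."""
    idx = np.linspace(0, len(values) - 1, length)
    return np.interp(idx, np.arange(len(values)), values)

def cycles_for_chassis(df: pd.DataFrame):
    """Split one chassis' readings into cycles on gaps > 2000 ms."""
    df = df.sort_values("time")
    new_cycle = df["time"].diff().dt.total_seconds().mul(1000).gt(GAP_MS)
    for _, cycle in df.groupby(new_cycle.cumsum()):
        yield cycle

def vectorize(cycle: pd.DataFrame) -> np.ndarray:
    """Build the 460-feature vector for one cycle."""
    vac = cycle.loc[cycle["operation"] == "vacuum", "vacuumpressure"].to_numpy()
    amt = cycle.loc[cycle["operation"] == "fill", "fillamount"].to_numpy()
    prs = cycle.loc[cycle["operation"] == "fill", "fillpressure"].to_numpy()
    return np.concatenate(
        [resize(vac, LEN_VAC), resize(amt, LEN_AMT), resize(prs, LEN_PRS)]
    )

# Labeling: every cycle for a chassis that is followed by another cycle
# implies a repeated (failed) attempt and gets label 1; the last cycle
# observed for a chassis gets label 0.
```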
11. 11/21
Machine Learning Methods
Random Forest Classifier
Ensemble learning method
Random selection of features
Implementation: Scikit-learn
Gradient Boosting Classifier
Ensemble of weak prediction models
Fit pseudo-residuals
Implementations: XGBoost, and CatBoost
Gaussian Process Classifier
Based on stochastic processes
Non-parametric and expressive
Implementation: Scikit-learn
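For concreteness, the four classifiers above can be instantiated as follows; the parameters shown are library defaults, whereas the paper tunes them with grid search (see the experimentation slide).

```python
from catboost import CatBoostClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.gaussian_process import GaussianProcessClassifier
from xgboost import XGBClassifier

# The four classifier families compared in the paper; the settings are
# illustrative defaults, not the grid-searched values.
models = {
    "random_forest": RandomForestClassifier(n_estimators=100),
    "xgboost": XGBClassifier(),
    "catboost": CatBoostClassifier(verbose=0),
    "gaussian_process": GaussianProcessClassifier(),
}
```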
12. 12/21
Convolutional Neural Networks
Convolutional Neural Networks:
Deep learning architecture
Convolutional layers followed by pooling layers
Extract features from raw data.
Implementation: Keras deep learning library on top of the Theano library
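A minimal Keras sketch in the spirit of this description (convolutional layers followed by pooling over the 460-reading cycle vector); the layer sizes here are our assumptions, not the paper's exact architecture.

```python
from keras.models import Sequential
from keras.layers import Conv1D, Dense, Flatten, MaxPooling1D

# Illustrative 1D CNN over a 460-reading cycle vector (one channel).
model = Sequential([
    Conv1D(32, kernel_size=5, activation="relu", input_shape=(460, 1)),
    MaxPooling1D(pool_size=2),
    Conv1D(64, kernel_size=5, activation="relu"),
    MaxPooling1D(pool_size=2),
    Flatten(),
    Dense(64, activation="relu"),
    Dense(1, activation="sigmoid"),  # 1 = leakage/failed, 0 = successful
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```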
13. 13/21
Evaluation Metrics
Accuracy (ACC): In an imbalanced dataset, using accuracy
as a sole evaluation metric can be misleading.
The default classifier, i.e., predicting every sample as 0
(successful), results in 96.23% accuracy.
Area Under the ROC Curve (AUC)
14. 14/21
Evaluation Metrics continued
Matthews Correlation Coefficient (MCC): More informative
than the raw confusion-matrix counts (TP, TN, FP, FN)
We report ACC, but focus on AUC and MCC.
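All three metrics are available in scikit-learn; a small helper, assuming y_score holds the positive-class probabilities:

```python
import numpy as np
from sklearn.metrics import accuracy_score, matthews_corrcoef, roc_auc_score

def evaluate(y_true: np.ndarray, y_score: np.ndarray, threshold: float = 0.5) -> dict:
    """Compute ACC, AUC, and MCC from positive-class scores."""
    y_pred = (y_score >= threshold).astype(int)
    return {
        "ACC": accuracy_score(y_true, y_pred),
        "AUC": roc_auc_score(y_true, y_score),  # AUC needs scores, not labels
        "MCC": matthews_corrcoef(y_true, y_pred),
    }
```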
15. 15/21
Experimentation
Data preparation
Use grid search to optimize model parameters
Evaluate with 5-fold stratified cross-validation:
Train the model
Evaluate the model
Report the average over the five folds (see the sketch below)
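A sketch of this loop for one classifier; the stand-in data, parameter grid, and random seeds are ours and purely illustrative.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score
from xgboost import XGBClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 460))     # stand-in for the 460-feature cycle vectors
y = rng.integers(0, 2, size=500)    # stand-in for the 0/1 leakage labels

# Tune hyperparameters with grid search, using stratified folds so each
# fold preserves the heavy class imbalance.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
search = GridSearchCV(
    XGBClassifier(),
    param_grid={"max_depth": [3, 6], "n_estimators": [100, 300]},
    scoring="roc_auc",
    cv=cv,
)
search.fit(X, y)

# Score the tuned model with 5-fold stratified cross-validation and
# report the average over the folds.
scores = cross_val_score(search.best_estimator_, X, y, scoring="roc_auc", cv=cv)
print(scores.mean())
```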
19. 19/21
Results and Discussion
XGBoost and CatBoost show 5-7% better performance than the other methods.
CatBoost is slightly better (~0.3%).
Although the Gaussian process is a powerful technique, its
efficiency and run-time performance are debatable:
O(n³) run-time and memory requirements
CNN architectures
Current architectures are not deep enough to perform as well as
gradient boosting classifiers.
20. 20/21
Conclusion
Time series data from the automobile industry
Binary classification using random forest, gradient boosting,
and Gaussian process classifiers, and CNN deep learning
architectures.
Gradient boosting methods give promising results.
CNNs did not learn to extract useful features, possibly due to:
The quality of the time series data
The number of samples
21. 21/21
Future Work
Increase the performance of DL methods:
Data augmentation
Deeper networks
LSTM models
CNN and LSTM in the same model (see the sketch below)
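One way the last item could look in Keras (convolutions for local feature extraction, an LSTM to summarize them over time); this is purely illustrative, as the paper does not specify such an architecture.

```python
from keras.models import Sequential
from keras.layers import Conv1D, Dense, LSTM, MaxPooling1D

# Hypothetical CNN + LSTM hybrid over a 460-reading cycle vector.
model = Sequential([
    Conv1D(32, kernel_size=5, activation="relu", input_shape=(460, 1)),
    MaxPooling1D(pool_size=2),
    LSTM(64),                        # summarizes the convolutional features
    Dense(1, activation="sigmoid"),  # leakage probability
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```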