Manufacturing Defect Detection Using Unsupervised Learning

According to the American Society for Quality, many firms spend up to 40% of their overall production revenue on quality-related expenses. A sizable portion of this expense comes from the inefficiency of manual inspection, which is the most common method of quality control in manufacturing. Applying artificial intelligence to automate quality control offers a more efficient and precise way to perform visual inspection on production lines.

However, standard machine-learning techniques impose several restrictions on how we can train and apply models for defect detection. In this post, we'll go over the benefits of unsupervised learning for defect identification and comment on the various methodologies.

What is AI defect detection and where is it used?

AI defect detection is based on computer vision and makes it possible to automate the entire quality inspection process using machine learning algorithms. Defect detection models are trained to visually inspect products as they move down the manufacturing line, spot irregularities on their surface, and identify variations in their dimensions, shape, or color. The output varies depending on what the model is trained to accomplish.

When used in quality control operations, defect detection AI is effective at inspecting extensive production lines and identifying flaws on even the smallest parts of a finished product. This applies to a broad range of manufactured goods that can have surface flaws of various kinds.

Intel describes the use of computer vision to automate tyre quality inspection. According to the report, the production line saved over $49,000 in labor expenditures while improving quality control accuracy from 90% to 99%. Such systems are not restricted to fixed factory hardware, however. Drones equipped with cameras, for instance, can be used to find flaws in pavement or other exterior surfaces, greatly reducing the time needed to cover a large city.

The pharmaceutical sector also benefits from inspecting production lines for various goods. For instance, Orobix applies defect detection to drug manufacturing using a particular kind of camera that can be operated by an inexperienced human operator.

The same methodology is used to examine pharmaceutical glass for flaws, such as cracks and air bubbles lodged in the glass. The food industry, textiles, electronics, heavy manufacturing, and other sectors all provide similar examples. However, typical machine learning approaches run into specific problems when applied to defect detection. Manufacturers inspect thousands of goods every day, which makes it challenging to collect and label sample data for training. Unsupervised learning is useful in this situation.

Unsupervised learning: What is it?

Supervised machine learning techniques are used in the vast majority of applications. Supervised learning requires that we give the model ground truth data by manually labeling the data that has been collected. Collecting and labeling data for a production line can be problematic because there is no way to collect every variation of cracks or dents on a product to guarantee correct detection by the model. There are four issues here:

  • It is difficult to acquire a significant amount of anomalous data.

  • The difference between a normal and an anomalous sample can be very small.

  • Two anomalous samples can differ greatly from each other.

  • The kind and number of anomalies cannot be predicted in advance.

Unsupervised machine learning algorithms let you identify patterns and the underlying structure of a dataset without prelabeled outcomes, in situations where it is hard to train the algorithm conventionally. In contrast to supervised learning, the training process requires less preparation work, since we expect the model to find patterns in the data on its own, with a larger tolerance for variation.

Anomaly detection reveals previously unseen rare objects or events without any prior knowledge about them. The only information available is that anomalies make up a small percentage of the dataset. For defect detection, this helps to solve the problem of data labeling and sample collection. So let's examine how unsupervised learning techniques can be applied to the development of defect detection models.

What role does unsupervised learning play in the discovery of defects?

Defect detection is closely related to the machine-learning problem of anomaly detection. Even though we don't rely on labeling, there are various unsupervised learning techniques that help organize the data and give the model hints.

  • Clustering is the process of grouping unlabeled examples based on their similarities. It is frequently used in recommendation engines, market or customer segmentation, social network analysis, and search result clustering.

  • The goal of association mining is to identify recurring correlations, relationships, or patterns in databases.

  • Latent variable models use latent variables to model the probability distribution of the data. They are primarily used for preparing data, reducing the number of features in a dataset, or breaking a dataset up into different parts based on features.

Patterns found through unsupervised learning can then be used to build traditional machine learning models. For example, we may cluster the available data and use the resulting clusters as a training set for supervised learning models.

Unsupervised Machine Learning for Concrete Crack Detection

An experiment was run on the Concrete Crack dataset by a team with extensive machine-learning experience. Unsupervised learning was used to develop a model that could distinguish between photos with flaws and those without. The study also examines how the number of defective photos affects the specific algorithms employed in this project.

For the use case we chose, it was assumed that image labels cannot be known in advance during training. Since training uses an unsupervised approach, only a test dataset is labeled, in order to assess the accuracy of the model's predictions. Five distinct strategies were tested for obtaining classification results from an unsupervised learning model.
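One practical detail when scoring an unsupervised model against a labeled test set is that cluster IDs are arbitrary: cluster 0 may correspond to "crack" just as well as to "no crack". The sketch below is one possible way to handle this, not the procedure used in the experiment. The function name `aligned_accuracy` and the toy labels are hypothetical, and it assumes there are no more clusters than classes.

```python
from itertools import permutations

def aligned_accuracy(true_labels, cluster_ids):
    """Best accuracy over all mappings from cluster IDs to class labels."""
    classes = sorted(set(true_labels))
    clusters = sorted(set(cluster_ids))
    best = 0.0
    # Try every assignment of class labels to cluster IDs (cheap for two clusters).
    for perm in permutations(classes, len(clusters)):
        mapping = dict(zip(clusters, perm))
        correct = sum(mapping[c] == t for c, t in zip(cluster_ids, true_labels))
        best = max(best, correct / len(true_labels))
    return best

# Toy test set: 0 = no crack, 1 = crack (hypothetical values).
y_true = [0, 0, 0, 1, 1, 1]
cluster_ids = [1, 1, 0, 0, 0, 0]
print(aligned_accuracy(y_true, cluster_ids))
```

Here the better mapping is cluster 1 to "no crack" and cluster 0 to "crack", which gets five of the six test images right.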


K-means Clustering

Clustering is used to organize unlabeled examples because no labeled ground truth data is available. In our situation, we must separate the dataset into two image clusters. A pre-trained VGG16 convolutional neural network was used for feature extraction, and K-means for clustering. Photos with and without cracks were grouped according to their visual similarity. Clustering techniques are simple to use and are frequently used as a starting point for more advanced deep-learning modeling.
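The clustering step can be sketched as follows. This is a minimal NumPy illustration under stated assumptions rather than the code used in the experiment: the feature vectors are synthetic stand-ins for VGG16 embeddings, the `kmeans` and `farthest_point_init` helpers are hypothetical, and the farthest-point initialization is a simplification chosen to keep the example deterministic.

```python
import numpy as np

def farthest_point_init(features, k, rng):
    """Pick a random first centroid, then repeatedly add the farthest sample."""
    centroids = [features[rng.integers(len(features))]]
    while len(centroids) < k:
        dist = np.min([np.linalg.norm(features - c, axis=1) for c in centroids], axis=0)
        centroids.append(features[dist.argmax()])
    return np.array(centroids)

def kmeans(features, k=2, n_iter=50, seed=0):
    rng = np.random.default_rng(seed)
    centroids = farthest_point_init(features, k, rng)
    for _ in range(n_iter):
        # Assignment step: each sample joins its nearest centroid.
        dists = np.linalg.norm(features[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: each centroid moves to the mean of its members.
        new_centroids = np.array([
            features[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids

rng = np.random.default_rng(42)
# Synthetic stand-ins for CNN feature vectors (assumption, not real image features).
no_crack = rng.normal(0.0, 0.3, size=(100, 16))
crack = rng.normal(3.0, 0.3, size=(100, 16))
features = np.vstack([no_crack, crack])

labels, centroids = kmeans(features, k=2)
print(np.bincount(labels))  # number of samples in each cluster
```

With real images, `features` would instead hold one row per image taken from the pre-trained network, and the two clusters would still need to be matched to "crack" and "no crack" using the labeled test set.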

Birch Clustering

In this method, features were extracted from photos using a pre-trained ResNet50 neural network and then grouped with the Birch clustering algorithm. This technique builds a tree data structure, and the cluster centroids are read from its leaves. It is a memory-efficient online learning algorithm.
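A minimal sketch of the online behaviour, assuming scikit-learn is available and using its `Birch` estimator. The feature vectors are synthetic stand-ins for ResNet50 embeddings, and the `threshold` value and batch count are illustrative choices, not settings from the experiment.

```python
import numpy as np
from sklearn.cluster import Birch

rng = np.random.default_rng(0)
# Synthetic stand-ins for ResNet50 feature vectors (assumption, not real image features).
no_crack = rng.normal(0.0, 0.5, size=(100, 16))
crack = rng.normal(10.0, 0.5, size=(100, 16))
features = np.vstack([no_crack, crack])

# threshold bounds the radius of each leaf subcluster; n_clusters=2 merges the
# leaf subclusters into two final groups.
model = Birch(threshold=1.8, branching_factor=50, n_clusters=2)

# Online learning: feed the data in small shuffled batches instead of all at once.
order = rng.permutation(len(features))
for batch in np.array_split(order, 4):
    model.partial_fit(features[batch])

labels = model.predict(features)
print(model.subcluster_centers_.shape)  # (number of leaf subclusters, 16)
```

Because the tree only stores compact summaries of the subclusters, each batch can be discarded after `partial_fit`, which is where the memory efficiency comes from.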

Custom Convolutional Autoencoder

A custom convolutional autoencoder consists of two building blocks: an encoder and a decoder. The encoder extracts features from the input image, and the decoder reconstructs the image from those features. Because we lack labels for network training, we need another way to obtain classes. Here, an adaptively configurable threshold on the reconstruction error is used to separate the two distributions (crack-free images and images with cracks).
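The source does not say how the adaptive threshold is chosen. As one possible approach, the sketch below applies Otsu's method to a one-dimensional sample of reconstruction errors, picking the cut that maximizes the variance between the two resulting groups. The `otsu_threshold` helper and the simulated error values are assumptions for illustration only.

```python
import numpy as np

def otsu_threshold(scores, n_bins=64):
    """Pick the cut that maximizes between-class variance of a 1-D score sample."""
    hist, edges = np.histogram(scores, bins=n_bins)
    centers = (edges[:-1] + edges[1:]) / 2
    weights = hist / hist.sum()
    w0 = np.cumsum(weights)                  # mass at or below each bin
    w1 = 1.0 - w0                            # mass above each bin
    cum_mean = np.cumsum(weights * centers)
    total_mean = cum_mean[-1]
    valid = (w0 > 1e-12) & (w1 > 1e-12)      # skip cuts that leave one side empty
    between = np.zeros(n_bins)
    between[valid] = (total_mean * w0[valid] - cum_mean[valid]) ** 2 / (w0[valid] * w1[valid])
    return edges[1:][between.argmax()]       # upper edge of the best split bin

rng = np.random.default_rng(7)
# Simulated per-image reconstruction errors (hypothetical values): crack-free images
# reconstruct well, cracked images reconstruct poorly and are much rarer.
normal_err = rng.normal(0.02, 0.005, size=200)
crack_err = rng.normal(0.10, 0.01, size=20)
errors = np.concatenate([normal_err, crack_err])

threshold = otsu_threshold(errors)
is_crack = errors > threshold
print(int(is_crack.sum()))  # number of images flagged as cracked
```

In practice the two error distributions overlap more than in this toy sample, so the threshold would be checked against the labeled test set before being used.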


DCGAN

DCGAN uses an adversarial loss (BCELoss) to produce images from z-space. Three losses are tracked: the generator loss, the discriminator loss, and the MSE loss (which compares generated images with the ground truth). Following the same methodology as in the custom autoencoder, we compare the losses on images with and without cracks and apply an adaptively configurable threshold to obtain the classification. Depending on their distributions, either the discriminator loss or the MSE loss should be used for the threshold.


GANomaly

GANomaly uses the conditional GAN method to train a generator to create images of the normal data. When an anomalous image is passed in during inference, the generator is unable to reconstruct it accurately. The result is good reconstruction for normal images and poor reconstruction for images with defects, and the difference is used to calculate the anomaly score.
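The source does not spell out the score itself. As we understand the GANomaly paper, the score is the L1 distance between the latent code of the input image and the latent code of its reconstruction, min-max scaled to the range [0, 1] over the test set. The sketch below illustrates that computation, with hypothetical latent vectors `z` and `z_hat` standing in for the outputs of the two encoders.

```python
import numpy as np

def anomaly_scores(z, z_hat):
    """L1 distance between latent codes, min-max scaled to [0, 1] over the batch."""
    raw = np.abs(z - z_hat).sum(axis=1)
    span = raw.max() - raw.min()
    # Guard against a batch where every image has the same raw score.
    return (raw - raw.min()) / span if span > 0 else np.zeros_like(raw)

# Hypothetical latent codes: one row per image.
z = np.array([[0.1, 0.2], [0.0, 0.5], [1.0, 1.0]])       # encoder output for the input
z_hat = np.array([[0.1, 0.25], [0.1, 0.5], [0.2, 0.1]])  # encoder output for the reconstruction

scores = anomaly_scores(z, z_hat)
print(scores.argmax())  # index of the most anomalous image in the batch
```

The third image has by far the largest gap between the two latent codes, so it receives the highest score and would be the first candidate for a defect.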

In Conclusion

The ability to skip collecting and labeling massive volumes of sample data is perhaps the most advantageous aspect of unsupervised learning techniques. When we use unsupervised learning approaches to derive data patterns, we are not restricted in which model can be used for the actual classification and defect detection.

However, since it's challenging to assess the model's predictive accuracy, particularly without a labeled dataset, unsupervised learning models are better suited for categorizing already-existing data into classes. To gain access to more of our whitepapers, visit here.
