Introduction
As you may know, DINOv2 possesses a powerful ability to extract semantic features from input images (please refer to this post). Building on these extracted semantic features, I have addressed the flower classification problem.
What I’m going to share can be summarized as follows:
A comparison of classification results using a fully connected layer on top of DINOv2 and ResNet-50 backbones.
A t-SNE visualization of the extracted features.
DINOv2
Foundation models such as CLIP and DINO have garnered significant attention from researchers. To keep up with recent technological advances, I have studied DINOv2. Briefly, DINOv2 excels at extracting semantic features because it is trained with a Self-Supervised Learning (SSL) method. This ability is evident in Figure 1 of the DINOv2 paper, which visualizes the first three principal components (PCA) of the patch features. In this figure, similar parts of objects receive the same color. For instance, the wings of an eagle and an airplane are green, while their heads are red. These results demonstrate DINOv2’s proficiency in extracting semantic features from images. I have implemented code to reproduce this PCA visualization.
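For reference, below is a minimal sketch of how such a visualization can be produced with the publicly released DINOv2 weights loaded via torch.hub. The file name eagle.jpg, the choice of the ViT-S/14 variant, and the 448x448 resize are assumptions for illustration; unlike the paper, this sketch skips the foreground/background separation step and simply maps the first three PCA components of the patch features to RGB.

```python
import torch
import matplotlib.pyplot as plt
from PIL import Image
from sklearn.decomposition import PCA
from torchvision import transforms

# Load a DINOv2 ViT-S/14 backbone from torch.hub (requires internet access).
model = torch.hub.load('facebookresearch/dinov2', 'dinov2_vits14')
model.eval()

# Resize so height and width are multiples of the 14-pixel patch size.
preprocess = transforms.Compose([
    transforms.Resize((448, 448)),
    transforms.ToTensor(),
    transforms.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)),
])

image = Image.open('eagle.jpg').convert('RGB')   # hypothetical input image
x = preprocess(image).unsqueeze(0)               # (1, 3, 448, 448)

with torch.no_grad():
    # forward_features returns per-patch tokens: (1, 32*32, 384) for ViT-S/14 at 448x448.
    feats = model.forward_features(x)['x_norm_patchtokens'][0].numpy()

# Project the patch features onto the first three principal components.
pca = PCA(n_components=3)
components = pca.fit_transform(feats)            # (1024, 3)

# Min-max normalize each component to [0, 1] and display them as RGB channels.
components = (components - components.min(0)) / (components.max(0) - components.min(0))
plt.imshow(components.reshape(32, 32, 3))
plt.axis('off')
plt.show()
```

Running this on two images of the same object category (e.g., an eagle and an airplane, as in the paper) should show corresponding parts falling onto similar PCA colors, which is the effect illustrated in Figure 1.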