Bagging ensemble for deep learning based gender recognition using test-time augmentation on large-scale datasets

DANIŞMAN, TANER

doi:10.3906/elk-2008-166

DANIŞMAN T.

TURKISH JOURNAL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCES, vol.29, no.4, pp.2084-2100, 2021 (SCI-Expanded)

Publication Type: Article / Full Article
Volume: 29 Issue: 4
Publication Date: 2021
DOI Number: 10.3906/elk-2008-166
Journal Name: TURKISH JOURNAL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCES
Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Academic Search Premier, Applied Science & Technology Source, Compendex, Computer & Applied Sciences, INSPEC, TR DİZİN (ULAKBİM)
Pages: pp.2084-2100
Keywords: Cross-dataset gender recognition, bagging methods, deep learning, test-time augmentation, NEURAL-NETWORK, AGE, FEATURES, PATTERN, IMAGES
Affiliated with Akdeniz University: Yes

Abstract

We present a bagging ensemble of convolutional networks combined with test-time augmentation to improve performance on the cross-dataset gender recognition problem. The bagging ensemble combines the predictions of multiple homogeneous models into a single ensemble prediction. Augmentation techniques are often used in the training phase of CNNs to improve generalization ability. Test-time augmentation, by contrast, is not commonly applied in the testing phase of a learned model. We conducted experiments on models trained with different hyperparameters. We augmented the test data and combined the predictive outputs of these network models. Experiments on diverse gender datasets, including Adience, AFAD, CelebA, Gallagher, Genki-4K, IMDb, LFW, Morph, VGGFace2, and Wiki, showed that a bagging ensemble of convolutional networks with test-time augmentation outperforms standalone models. We obtained the highest cross-dataset accuracy in the literature on seven out of eleven datasets. For the remaining four datasets, we report cross-dataset results for the first time. According to our experiments, the VGGFace2, IMDb, and CelebA datasets provided the highest cross-dataset classification results for most of the test datasets in the gender recognition problem.
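
The idea of augmenting the test data and combining the outputs of several models can be illustrated with a minimal sketch. The abstract does not state which augmentations were used or how the outputs were combined, so the sketch assumes a horizontal flip as the only augmentation and plain averaging of class probabilities. All names (`Image`, `Model`, `horizontal_flip`, `tta_ensemble_predict`) are hypothetical, and each model is assumed to be a callable that maps one image to a list of class probabilities.

```python
from typing import Callable, List, Sequence, Tuple

# Assumed representations (not from the paper):
Image = List[List[float]]                   # one grayscale image as rows of pixels
Model = Callable[[Image], Sequence[float]]  # returns one probability per class


def identity(image: Image) -> Image:
    """Leave the image unchanged, so the original view is also scored."""
    return image


def horizontal_flip(image: Image) -> Image:
    """Mirror every row left to right (assumed test-time augmentation)."""
    return [row[::-1] for row in image]


def tta_ensemble_predict(
    models: Sequence[Model],
    image: Image,
    augmentations: Sequence[Callable[[Image], Image]] = (identity, horizontal_flip),
) -> Tuple[int, List[float]]:
    """Average class probabilities over every (model, augmentation) pair.

    Averaging is an assumed combination rule; voting would be another option.
    Returns the index of the winning class and the averaged probabilities.
    """
    outputs = [model(aug(image)) for model in models for aug in augmentations]
    n_classes = len(outputs[0])
    mean_probs = [
        sum(out[c] for out in outputs) / len(outputs) for c in range(n_classes)
    ]
    predicted = max(range(n_classes), key=lambda c: mean_probs[c])
    return predicted, mean_probs
```

With averaging, a single confident model can outweigh two weakly disagreeing ones, which a majority vote would not allow. Which rule suits a given task is an empirical question that this sketch does not settle.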