Meta-learning analysis of deep neural network architectures on diverse numeric datasets via geometric complexity descriptors
Abstract
Meta-learning techniques aim to predict the most suitable learning algorithm for a given dataset based on its intrinsic structural characteristics. These techniques provide a robust framework for understanding algorithmic behavior across diverse data distributions and attributes. Although state-of-the-art deep models (CNNs and transformers) are widely applied in various machine learning tasks, their use on numerical datasets remains underexplored due to the complexity of their internal structures. This study aims not only to predict the performance of two black-box deep learning models on static datasets but also to conduct a behavioral analysis to identify which meta-features most strongly influence their outcomes. It remains unclear which specific attributes of a dataset positively or negatively affect the performance of these deep learning models. To bridge this gap, we constructed a meta-dataset consisting of 296 datasets, each characterized by 20 meta-features describing its statistical, geometric, and structural properties. The analysis identifies which intrinsic dataset properties influence model accuracy, without relying on raw data or hyperparameter tuning. Results show that both models perform best on datasets with high feature discriminability, as captured by meta-features such as maximum feature efficiency, collective feature efficiency, and directional separability. In contrast, performance declines with increasing class-boundary complexity and nonlinearity, reflected in features such as class separability measures and the linear-classifier nonlinearity metric. While CNNs are more sensitive to local geometric complexity, transformers respond more strongly to global statistical measures such as mutual information and entropy, highlighting their distinct inductive biases.
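To illustrate the kind of global statistical meta-features named above (entropy and mutual information), the sketch below estimates both from a numeric feature and its class labels via histogram binning. This is a generic illustration, not the paper's meta-feature implementation; the binning scheme and function names are assumptions.

```python
import numpy as np

def shannon_entropy(values, bins=10):
    """Shannon entropy (bits) of a numeric feature, estimated by histogram binning."""
    counts, _ = np.histogram(values, bins=bins)
    p = counts[counts > 0] / counts.sum()
    return float(-(p * np.log2(p)).sum())

def mutual_information(feature, labels, bins=10):
    """Mutual information I(X; Y) in bits between a binned numeric feature X
    and discrete class labels Y, from the empirical joint distribution."""
    edges = np.histogram_bin_edges(feature, bins=bins)
    x = np.digitize(feature, edges[:-1])  # discretize the feature
    mi = 0.0
    for xv in np.unique(x):
        for yv in np.unique(labels):
            p_xy = np.mean((x == xv) & (labels == yv))
            if p_xy > 0:
                p_x = np.mean(x == xv)
                p_y = np.mean(labels == yv)
                mi += p_xy * np.log2(p_xy / (p_x * p_y))
    return float(mi)
```

For a feature that perfectly separates two equally sized classes, the mutual information equals the one bit of label entropy; uninformative features score near zero.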
The proposed meta-model accurately predicts the performance of both architectures on unseen datasets (0.96 correlation coefficient, 0.019 MAE, and 0.025 RMSE for CNNs; 0.92 correlation coefficient, 0.027 MAE, and 0.036 RMSE for transformers), enabling performance estimation without costly training. These findings emphasize the importance of aligning model architecture with dataset geometry and structure. Additionally, the framework supports more interpretable, efficient, and sustainable deep learning model selection in structured data settings.