Application of Machine Learning in the Genetic Classification of Gold Deposits: An Exploratory Study Based on Pyrite Trace Element Geochemistry
Abstract
Pyrite is the most common metallic sulfide in gold deposits. Its trace element composition records the physicochemical properties of ore-forming fluids and the mineralization environment, making it an important indicator for discriminating the genetic types of gold deposits. With the rapid development of information technology, geoscience research has entered a data-intensive era, and the accumulation of multi-source, multi-scale geochemical data provides new opportunities for the quantitative classification of deposit types. However, existing studies remain limited by small sample sizes, incomplete coverage of deposit types, and relatively low classification accuracy. To address these issues, this study systematically compiles LA-ICP-MS trace element data for pyrite from seven types of gold deposits, namely iron oxide–copper–gold (IOCG), volcanogenic massive sulfide (VMS), porphyry, low-sulfidation epithermal, high-sulfidation epithermal, Carlin-type, and orogenic deposits, and constructs a large geochemical dataset. After missing-value control and centered log-ratio (CLR) transformation, Support Vector Machine (SVM) and Random Forest (RF) models are applied to classify deposit types, and their performance is evaluated using cross-validation and multiple metrics. The results show that the SVM model trained on the original CLR feature space achieves the highest classification accuracy and stability. The RF model also retains strong predictive capability, and SHAP analysis of this model reveals the key trace element combinations that control classification. Overall, this study demonstrates that machine learning methods can extract geologically meaningful mineralization signatures, and that combining high-performance classification models with interpretability analysis provides an effective approach for identifying gold deposit types and studying their genesis in complex mineralization systems.
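For readers unfamiliar with the CLR transformation mentioned above, the following is a minimal sketch of the standard definition, in which each log-transformed value is centered on the mean of the logs of the same analysis. It is an illustration under stated assumptions and not the implementation used in this study: the function name `clr`, the element list, and the concentration values are hypothetical, and zeros or missing values are assumed to have been handled beforehand.

```python
import math


def clr(composition):
    """Centered log-ratio transform of one strictly positive composition.

    Each output value is ln(x_i) minus the mean of ln(x) over all parts,
    which equals ln(x_i / geometric_mean(x)).
    """
    if any(v <= 0 for v in composition):
        # Assumption: zeros and missing values are treated before this step.
        raise ValueError("CLR requires strictly positive values")
    logs = [math.log(v) for v in composition]
    mean_log = sum(logs) / len(logs)
    return [lv - mean_log for lv in logs]


# Hypothetical concentrations (ppm) for a single pyrite spot analysis;
# these are illustrative numbers, not data from the study.
sample = {"Co": 120.0, "Ni": 45.0, "As": 3200.0, "Au": 0.8}
transformed = dict(zip(sample, clr(list(sample.values()))))
```

Two properties make the transform suitable for compositional data before classification: the transformed values of each analysis sum to zero, and multiplying all concentrations by a constant (for example, a unit change) leaves the result unchanged.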