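Application of Machine Learning in the Genetic Classification of Gold Deposits: An Exploratory Study Based on Pyrite Trace Element Geochemistry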

樊松浩; 王达; 朱国韬; 杨彪; 苏攀云; 侯秀宏; 吕汉秦; 韩燕东; 陈磊

doi:10.12401/j.nwg.2025187

Abstract

Pyrite is the most common metal sulfide in gold deposits. Its trace element composition records the properties of ore-forming fluids and the mineralization environment, which makes it an important indicator for discriminating the genetic types of gold deposits. With the rapid development of information technology, geoscience research has entered a data-intensive stage, and the accumulation of multi-source, multi-scale geochemical data offers new opportunities for the quantitative identification of deposit types. However, existing studies remain limited by small sample sizes, insufficient coverage of deposit types, and relatively low classification accuracy. To address these limitations, this study systematically compiles LA-ICP-MS trace element data of pyrite from seven types of gold deposits, namely iron oxide–copper–gold (IOCG), volcanogenic massive sulfide (VMS), porphyry, low-sulfidation epithermal, high-sulfidation epithermal, Carlin-type, and orogenic deposits, and constructs a large dataset. After missing values are controlled and the centered log-ratio (CLR) transformation is applied, support vector machine (SVM) and random forest (RF) models are used to classify deposit types, and their performance is compared systematically using cross-validation and multiple evaluation metrics. The results show that the SVM model built on the original CLR feature space performs best in terms of classification accuracy and stability. The RF model maintains strong discriminative capability and, combined with SHAP analysis, reveals the key trace element combinations on which the discrimination of each deposit type depends. Overall, this study demonstrates that machine learning methods can effectively extract geologically meaningful mineralization signatures, and that combining high-performance classification models with interpretability analysis is an effective technical route for identifying gold deposit types and studying their genesis in complex mineralization systems.
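To illustrate the centered log-ratio (CLR) preprocessing step mentioned in the abstract, the following is a minimal sketch rather than the authors' implementation. It assumes that every concentration is strictly positive, meaning that missing or below-detection values have already been handled. The function name `clr`, the element list, and the ppm values are hypothetical and chosen only for illustration.

```python
import math


def clr(composition):
    """Centered log-ratio transform of one strictly positive composition.

    Each value is replaced by its log minus the mean of all logs,
    i.e. the log of the value divided by the geometric mean.
    """
    if not composition:
        raise ValueError("composition must not be empty")
    if any(v <= 0 for v in composition):
        # Assumption: zeros and below-detection values were imputed or
        # filtered beforehand; that step is not shown here.
        raise ValueError("CLR requires strictly positive values")
    logs = [math.log(v) for v in composition]
    mean_log = sum(logs) / len(logs)  # log of the geometric mean
    return [lv - mean_log for lv in logs]


# Hypothetical pyrite spot analysis in ppm (illustrative values only).
elements = ["Co", "Ni", "As", "Se", "Sb"]
sample_ppm = [120.0, 85.0, 2400.0, 15.0, 3.2]

sample_clr = clr(sample_ppm)
for name, value in zip(elements, sample_clr):
    print(f"{name}: {value:+.3f}")
```

The transformed values sum to zero and do not change if all concentrations are multiplied by the same factor, which removes the closure effect of compositional data. The resulting vectors could then be passed as features to a classifier such as an SVM or a random forest.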
