Research on Content Storage Method of Unstructured Geological Data
-
Abstract
Geological work has entered the era of big data, yet unstructured data such as reports and maps that carry geoscience information are still classified in simple ways and stored in file systems, forming many data sets with complex internal structures. This approach can neither adequately convey the rich geoscience information carried by unstructured data and the complex relationships among pieces of information, nor reveal the knowledge hidden across data sets. To solve this problem, this paper proposes a multi-granularity content tree model and a data modeling method that supports evolution. The model splits data content at different scales so that information can be located accurately, and it extends the dimensions of a subject's feature description according to the needs of the data subject. In this way, the information contained in the data is discovered and the relationships among pieces of information are established. This paper also designs a persistence method for the data model with HBase at its core, so that the data can be processed within the big data technology stack. A modeling example, in which unstructured documents and maps are split and reconstructed using the content entity as the smallest unit, shows good results in content organization and information conveyance.
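To make the idea of a multi-granularity content tree more concrete, the following is a minimal illustrative sketch, not the implementation described in this paper. It assumes that each content entity is a tree node with a granularity label and an open-ended set of feature dimensions, and that the tree is flattened into HBase-style rows whose keys encode the node's position. The class `ContentNode`, the functions `walk` and `to_rows`, the column prefixes `c:` and `f:`, the row key format, and the sample report are all hypothetical; plain dictionaries stand in for an HBase table.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Tuple


@dataclass
class ContentNode:
    """One content entity in a hypothetical multi-granularity content tree."""
    name: str                      # label of the entity, e.g. a chapter title
    level: str                     # granularity label (assumed names)
    content: str = ""              # text, or a reference to map content
    features: Dict[str, str] = field(default_factory=dict)  # extensible dimensions
    children: List["ContentNode"] = field(default_factory=list)

    def add_child(self, child: "ContentNode") -> "ContentNode":
        self.children.append(child)
        return child


def walk(node: ContentNode, key: str = "0") -> Iterator[Tuple[str, ContentNode]]:
    """Yield (row_key, node) pairs depth-first.

    The key encodes the path from the root (an assumed format), so all
    rows of a subtree share the key of the subtree root as a prefix.
    """
    yield key, node
    for i, child in enumerate(node.children):
        yield from walk(child, f"{key}.{i:03d}")


def to_rows(root: ContentNode) -> Dict[str, Dict[str, str]]:
    """Flatten the tree into row_key -> {"family:qualifier": value}."""
    rows: Dict[str, Dict[str, str]] = {}
    for key, node in walk(root):
        row = {"c:name": node.name, "c:level": node.level, "c:content": node.content}
        # Each feature dimension becomes its own column qualifier, so a new
        # dimension only adds a column and leaves existing rows untouched.
        for dim, value in node.features.items():
            row[f"f:{dim}"] = value
        rows[key] = row
    return rows


# Hypothetical example: a report split at three granularity levels.
report = ContentNode("Survey report", "document")
chapter = report.add_child(ContentNode("Stratigraphy", "chapter"))
paragraph = chapter.add_child(
    ContentNode("Paragraph 1", "paragraph", content="(paragraph text)",
                features={"era": "Jurassic"})
)

# The description of a subject can evolve by adding a feature dimension later.
paragraph.features["lithology"] = "sandstone"

rows = to_rows(report)
subtree_keys = [k for k in sorted(rows) if k.startswith("0.000")]
```

In this sketch, selecting the keys that start with `0.000` returns the chapter together with its paragraphs, which mirrors how a prefix scan over row keys could retrieve one branch of the content tree. Storing each feature dimension under its own column qualifier is one possible way to let the description of a subject grow without changing the table schema.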
-
-