Acta Geographica Sinica, 2019, Vol. 74, Issue (3): 586-598. doi: 10.11821/dlxb201903014

Special topic: Big geodata


The essence of big geodata mining

Tao PEI1,2, Yaxi LIU1,2, Sihui GUO1,2, Hua SHU1,2, Yunyan DU1,2, Ting MA1,2, Chenghu ZHOU1,2

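  1. State Key Laboratory of Resources and Environmental Information System, Institute of Geographic Sciences and Natural Resources Research, CAS, Beijing 100101, China
  2. University of Chinese Academy of Sciences, Beijing 100049, China
  • Received: 2018-10-08 Revised: 2019-02-15 Online: 2019-03-25 Published: 2019-03-19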
  • About the author:

    Tao PEI (1972-), male, research professor, doctoral supervisor, mainly engaged in research on big geodata mining. E-mail: peit@lreis.ac.cn

  • Funding:
    National Natural Science Foundation of China, No. 41525004, No. 41421001

Principle of big geodata mining

Tao PEI1,2, Yaxi LIU1,2, Sihui GUO1,2, Hua SHU1,2, Yunyan DU1,2, Ting MA1,2, Chenghu ZHOU1,2

  1. State Key Laboratory of Resources and Environmental Information System, Institute of Geographic Sciences and Natural Resources Research, CAS, Beijing 100101, China
  2. University of Chinese Academy of Sciences, Beijing 100049, China
  • Received: 2018-10-08 Revised: 2019-02-15 Online: 2019-03-25 Published: 2019-03-19

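Abstract (Chinese version, translated):

Addressing the intrinsic nature of big geodata and the significance of big geodata mining for geographic research, this paper explains the meaning of big geodata and, building on the "5V" characteristics of big data, proposes five further characteristics (the "5 degrees"): granularity, scope, density, skewness and precision, which reveal the essential features of big geodata. On this basis, the essence and role of big geodata mining are elaborated from four aspects: the representation of big geodata, the goals of big geodata mining, the superposition and scale dependence of geographic patterns, and the relationship between big geodata mining and geography. Big geodata mining methods are also classified according to their mining goals. Future research on big geodata mining will face several challenges: the aggregation of big geodata, the evaluation of the validity of mining results, and the discovery of valuable knowledge rather than common sense.

Key words: spatial pattern, spatial relationship, spatial distribution, flow space, spatio-temporal heterogeneity, knowledge discovery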

Abstract:

This paper examines the principle of big geodata mining and its significance for geographic research. Big geodata are first divided into two domains: earth observation big data and human behavior big data. Five attributes beyond the "5V" of big data, namely granularity, scope, density, skewness and precision, are then summarized for big geodata. On this basis, the essence and role of big geodata mining are described from four aspects. First, with the surge of human behavior big data, flow space, in which the OD flow rather than the point of traditional space is the basic unit, will become a new form of representation for big geodata. Second, the target of big geodata mining is defined as revealing spatial patterns and spatial relationships. Third, the spatio-temporal distribution of big geodata can be seen as the overlay of multiple geographic patterns, and these patterns may change with scale. Fourth, big geodata mining can be viewed as a tool for discovering geographic patterns, while the revealed patterns are ultimately attributed to the human-land relationship. Big geodata mining methods are divided into two types according to the mining target: classification mining and relationship mining. Future research will face the following challenges: the aggregation and connection of big geodata, the evaluation of the validity of mining results, and the mining of "true and useful" knowledge.

Key words: spatial pattern, spatial relationship, spatial distribution, flow space, spatio-temporal heterogeneity, knowledge discovery