A survey on geocoding: algorithms and datasets for toponym resolution

Publication Date: June 10, 2024

Zhang, Zeyu & Bethard, Steven. (2024). A survey on geocoding: algorithms and datasets for toponym resolution. Language Resources and Evaluation, pp. 1-22. doi:10.1007/s10579-024-09730-2.


Geocoding, the task of converting unstructured text to structured spatial data, has recently seen progress thanks to a variety of new datasets, evaluation metrics, and machine-learning algorithms. Geocoding plays a critical role in tasks such as tracking the evolution and emergence of infectious diseases, analyzing and searching documents by geography, geospatial analysis of historical events, and disaster response mechanisms. To assist those new to this area of research, we provide a survey that reviews, organizes, and analyzes recent work on geocoding (also known as toponym resolution), in which text is matched to geospatial coordinates and/or ontologies. We summarize the findings of this research, including the domains and databases covered by current geocoding corpora, point-based and polygon-based evaluation metrics, and the features and architectures of geocoding systems.
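
To make the idea of a point-based evaluation metric concrete, the following is a minimal sketch in Python. The abstract does not name specific metrics, so the choices here are assumptions for illustration only: error is taken as the great-circle (haversine) distance between a predicted and a gold coordinate, and accuracy is the fraction of predictions within a distance threshold. The 161 km default (roughly 100 miles) and the function names `haversine_km`, `mean_error_km`, and `accuracy_within` are hypothetical and not taken from the survey.

```python
import math

# Mean Earth radius in kilometres (assumed spherical Earth).
EARTH_RADIUS_KM = 6371.0088


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    # Clamp to guard against floating-point values slightly above 1.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def mean_error_km(predicted, gold):
    """Average distance error over paired (lat, lon) predictions and gold labels."""
    errors = [haversine_km(*p, *g) for p, g in zip(predicted, gold)]
    return sum(errors) / len(errors)


def accuracy_within(predicted, gold, threshold_km=161.0):
    """Fraction of predictions within threshold_km of the gold point.

    The 161 km default is an illustrative assumption, not a value from the text.
    """
    hits = sum(haversine_km(*p, *g) <= threshold_km
               for p, g in zip(predicted, gold))
    return hits / len(predicted)


# Hypothetical example: the toponym "Paris" resolved once correctly
# (Paris, France) and once to the wrong referent (Paris, Texas).
gold = [(48.8566, 2.3522), (48.8566, 2.3522)]
predicted = [(48.8566, 2.3522), (33.6609, -95.5555)]
print(accuracy_within(predicted, gold))
print(mean_error_km(predicted, gold))
```

A polygon-based metric would instead compare a predicted region against a gold region (for example by area overlap), which requires geometry data that a point comparison like this one does not use.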