Intriguing Features on the Oxford Map of the English Wikipedia
As a habitual Wikipedia reader, I am particularly intrigued by the map and article entitled “Mapping English Wikipedia” found at Information Geographies (at the Oxford Internet Institute). Here, almost 700,000 dots have been placed on a world map to show the locations of geotagged articles in the English-language Wikipedia. As the authors explain:
Not all articles are geotagged, but almost all articles about events and places tend to be. The data in this map were all taken from November 2011 Wikipedia data dumps. Our project team wrote a script to search for coordinate representations in every article (taking into account the varying ways in which geo-coordinates are expressed). We improved the quality of our coordinates by doing things like eliminating or fixing erroneous coordinates, grabbing coordinates (where sensible) from not just structured infoboxes, and making sure to remove irrelevant coordinates (Wikipedia actually contains a lot of coordinates for extra-terrestrial entities like lunar craters!).
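To make the quoted description a little more concrete, here is a minimal sketch of what such a coordinate search might look like. It is not the Oxford team's script, which I have not seen. It assumes the coordinates are written with Wikipedia's {{coord}} template, handles only the decimal and degrees-minutes-seconds forms, and uses a hypothetical function name, extract_coordinates, of my own choosing. It drops out-of-range values and anything tagged with a non-Earth globe, such as lunar craters.

```python
import re

# Matches a {{coord|...}} template with no nested templates inside it.
COORD_RE = re.compile(r"\{\{\s*coord\s*\|([^{}]*)\}\}", re.IGNORECASE)
GLOBE_RE = re.compile(r"globe:([a-z]+)", re.IGNORECASE)


def _to_decimal(parts, hemisphere=None):
    """Convert [deg], [deg, min] or [deg, min, sec] strings to decimal degrees."""
    value = sum(float(p) / 60 ** i for i, p in enumerate(parts))
    return -value if hemisphere in ("S", "W") else value


def extract_coordinates(wikitext):
    """Return a list of (lat, lon) pairs found in {{coord}} templates.

    Illustrative only: real articles express coordinates in more ways
    than this sketch covers (infobox fields, other templates, and so on).
    """
    results = []
    for match in COORD_RE.finditer(wikitext):
        body = match.group(1)

        # Skip coordinates on other bodies, e.g. "globe:moon".
        globe = GLOBE_RE.search(body)
        if globe and globe.group(1).lower() != "earth":
            continue

        # Keep the leading run of numbers and hemisphere letters.
        tokens = []
        for field in (f.strip() for f in body.split("|")):
            if field.upper() in ("N", "S", "E", "W"):
                tokens.append(field.upper())
                continue
            try:
                float(field)
            except ValueError:
                break  # reached options such as "display=title"
            tokens.append(field)

        split_at = next((i for i, t in enumerate(tokens) if t in ("N", "S")), None)
        if split_at is not None:
            lat_parts, rest = tokens[:split_at], tokens[split_at + 1:]
            if not rest or rest[-1] not in ("E", "W"):
                continue
            lon_parts, lon_hem = rest[:-1], rest[-1]
            if not (1 <= len(lat_parts) <= 3 and 1 <= len(lon_parts) <= 3):
                continue
            if any(t in ("N", "S", "E", "W") for t in lat_parts + lon_parts):
                continue
            lat = _to_decimal(lat_parts, tokens[split_at])
            lon = _to_decimal(lon_parts, lon_hem)
        elif len(tokens) == 2:
            lat, lon = _to_decimal(tokens[:1]), _to_decimal(tokens[1:])
        else:
            continue

        # Discard obviously erroneous values.
        if -90 <= lat <= 90 and -180 <= lon <= 180:
            results.append((lat, lon))
    return results
```

Run over every article in a data dump, a function along these lines would yield the raw list of points that a map of this kind plots; the cleanup steps the authors mention would presumably go well beyond the simple range check shown here.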
The results are interesting. As the authors understatedly note, “there is clearly a lot of unevenness in the amount of content about places, and large parts of our planet are still invisible from these digital augmentations…” The unevenness of coverage is indeed conspicuous, but much of that is to be expected. It is hardly surprising, for example, that vast reaches of sparsely populated land in northern Siberia would be largely bypassed by Wikipedia. I am more perplexed, however, by the fact that a few uninhabited and remote places, such as South Georgia Island, would be fully covered by yellow dots, whereas some densely populated and easily accessible areas, such as China’s Shandong Peninsula, would be mostly unmarked. (I suspect that the attention given to South Georgia stems in part from popular interest in the survival story of the Shackleton Expedition.) But regardless of this South Georgia oddity, the relative paucity of coverage of China is surely one of the map’s more striking features.
India is much more heavily covered in the English Wikipedia than China, as might be expected, considering the widespread use of English in India along with the British colonial legacy. But colonial legacies as well as the geographies of language are in general not easily seen on the map. Consider, for example, its portrayal of West Africa, visible in the first set of detailed maps. Here the Gambia can be made out, but otherwise political borders are not discernible, even though several of them separate Anglophone from Francophone countries. I have roughly outlined Ghana to emphasize this point. Notice as well the concentrated clusters of dots in Burkina Faso to the north of Ghana. As Burkina Faso is a poor, somewhat marginal, Francophone country, its prominence in the English Wikipedia is noteworthy.
Only in a few parts of the world are political boundaries visible on the map. The clearest example is Eastern Europe; here Poland stands out in sharp contrast to Ukraine and Belarus. The heavy English Wikipedia coverage of Poland is intriguing, as is that of Estonia and Moldova. Estonia is noted for its tech-savvy population, and hence its standing in the encyclopedia is not too surprising, but I am mystified by the blanket coverage of Moldova, Europe’s poorest country.
Equally mysterious to me are the patches of concentrated Wikipedia coverage in upper Burma. As the set of maps showing southern Asia indicates, Wikipedia reporting on India and across much of Southeast Asia matches population distribution relatively well. In southwestern China, however, this connection collapses; sparsely populated Tibet receives roughly the same coverage as densely populated Sichuan. I find it remarkable that one cannot even pick out the major metropolitan areas of Chengdu and Chongqing, both of which stand out very clearly on earth-at-night satellite images.
Political boundaries are evident in several other parts of the world. Armenia and Azerbaijan, for example, are easily discernible, although the “blob of yellow” that covers both countries also oddly extends into Iran, a pattern that is only partially explicable on the basis of population density. Second-order political boundaries are vaguely evident in the Midwest of the United States, where the western and southern boundaries of Minnesota can be distinguished, as can the state boundaries along the Mississippi and Ohio rivers. In the Great Plains of the United States and Canada, linear features on the map correspond to roads and railways. This feature is particularly evident in northern Ontario and Manitoba, where two rail lines appear as long lines of yellow dots.
If any readers have ideas about the unusual features found on this map, I would be very interested to hear them.