80% of global internet knowledge skewed by the White 20%
We recently passed a milestone in the history of human connectivity – people online now make up the majority of the world’s population. This has largely gone unnoticed, but it is an important moment and not just for statistical reasons.
North American and European internet users now make up only about a quarter of the world’s users. Furthermore, while countries like the US and the UK have almost reached internet saturation, Africa, Asia and Latin America are home to billions more people who will come online in the next few years.
The networking of humanity is no longer confined to a few economically prosperous parts of the world. For the first time in history, we are creating a truly global and accessible communication network. However, while access to the internet is quickly being democratized, our research for the Geonet project at the Oxford Internet Institute shows that web content remains heavily skewed towards rich, western countries.
All of sub-Saharan Africa combined, despite having 10% of the world’s internet users, registers only 0.7% of the world’s domain names (a good proxy for how much web content is produced) and 0.5% of the world’s commits (or revisions) to GitHub (a proxy for how much computer code people write and share in a place). France alone produces 5.7 times more GitHub commits and 3.4 times more domain registrations than all the sub-Saharan countries.
The geographic and gender skew of Wikipedia edits is perhaps even more concerning. Research shows that the vast majority of content on Wikipedia written about most African countries is written by (primarily male) editors in Europe or North America. Wikipedia is one of the most used websites in the world and an important data source for countless platforms and services.
So 20% of the world or less shapes our understanding of 80% of the world. This causes an amplification of geographical and gendered biases, including on search engines like Google. If you are using Google to search for local information in Belgium, Canada or Australia, you will be directed to primarily locally produced content. But if you’re in Sierra Leone, Pakistan or Indonesia, almost all of the content you are shown is produced by outsiders.
For a Nigerian woman going online, there are hardly any Wikipedia biographies of the women she reads about in a national newspaper. You might speak Mandarin, Bengali or Arabic, all of which are in the top 10 most spoken languages. But the Bengali Wikipedia has only 52,000 articles for a language spoken by 237 million people, while the Dutch Wikipedia has nearly 2 million articles for a language spoken by 28 million people.
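To make the Bengali and Dutch comparison concrete, here is a minimal sketch that converts the figures quoted above into articles per million speakers. The numbers are the article's own rounded figures ("nearly 2 million" is treated as 2 million, so the Dutch rate is a slight overestimate); the function and variable names are illustrative only.

```python
# Figures as quoted in the article; speaker counts are in millions.
# "Nearly 2 million" Dutch articles is rounded up to 2,000,000 (an assumption).
wikipedias = {
    "Bengali": {"articles": 52_000, "speakers_millions": 237},
    "Dutch": {"articles": 2_000_000, "speakers_millions": 28},
}


def articles_per_million_speakers(articles, speakers_millions):
    """Return how many Wikipedia articles exist per million speakers."""
    return articles / speakers_millions


rates = {
    name: articles_per_million_speakers(**figures)
    for name, figures in wikipedias.items()
}

for name, rate in rates.items():
    print(f"{name}: about {rate:,.0f} articles per million speakers")

# How many times more articles per speaker Dutch has than Bengali.
gap = rates["Dutch"] / rates["Bengali"]
print(f"Dutch-to-Bengali ratio: about {gap:.0f} to 1")
```

On these rounded inputs the gap per speaker runs to a few hundred to one, which is far larger than the raw article counts alone suggest.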
In a world riven by stereotypes and discrimination, the internet should be challenging the biases of our physical world, not deepening them. In fact, the internet could well serve as a digital space that reflects and produces the richness of our world’s multiple forms of knowledge, through a combination of text, voice and visuals.
As Google estimated a few years ago, the world has nearly 130 million books in at least 480 languages. Yet in a world of nearly 7,000 languages and dialects, that means only about 7% of our languages are represented in published material. We need to do much more to capture the oral knowledge of our past and present.
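The "about 7%" figure follows directly from the two numbers in the paragraph above. A quick check, taking "at least 480" and "nearly 7,000" at face value (both are approximations in the source, and the variable names here are illustrative):

```python
# Both inputs are the article's approximate figures, not exact counts.
languages_with_books = 480   # "at least 480 languages"
languages_total = 7_000      # "nearly 7,000 languages and dialects"

share = languages_with_books / languages_total
print(f"Share of languages with published books: {share:.1%}")
```

Because the first figure is a lower bound and the second an upper bound, the true share is probably a little higher than this calculation gives, but still under a tenth.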
So how do we make the internet look more like the world we live in? Those of us who make up most of the world need to bring our information and knowledge online, and all of us – wherever we are from – need to help make it happen.
A number of individuals, groups and campaigns have been working to make the internet more diverse and plural. Wikimujeres (and similar initiatives in different languages) work on increasing the number of women’s biographies from Latin America on the Spanish Wikipedia.
Wiki Loves Africa expands the number of high-quality images from African countries, and Afrocrowd works to create and improve information on black culture and history on Wikipedia. Organizations such as the Association for Progressive Communications focus on women’s rights and knowledge in internet and telecommunications policies.
Whose Knowledge? is a global, multilingual campaign that works with these groups and beyond, to centre the histories and knowledge of the majority of the world that is underrepresented on the internet. For instance, in 2016 when we began, we worked with scholars from the Kumeyaay Native American community of southern California, on the Wikipedia article about the California gold rush, to reflect its deeply negative impact on Native American communities.
In April, with our partners in Equality Labs, we held a Wikipedia editing session to include information about the 350 million people of India’s Dalit community, and wrote about inspirations like Grace Banu – the first transgender Dalit person to be admitted to an engineering college in Tamil Nadu. At the same time, we are working with communities such as the Dalit and the Kumeyaay to archive their oral knowledge and histories.
Google and other key mediators of information should have a responsibility to ensure that communities around the world are not flooded with foreign content, and that the internet begins to resemble the network for billions that it is meant to be. But we – as users – also have a responsibility to question the perspectives presented to us by the Googles and the Wikipedias of the world, and perhaps also to change them: to edit, to create, and to build the internet we want to see.
Mark Graham and Anasuya Sengupta
Mark Graham is professor of internet geography at the Oxford Internet Institute at the University of Oxford and a faculty fellow at the Alan Turing Institute. His research focuses on digital labour, digital geographies and inequalities in our digital world.
Anasuya Sengupta is the co-founder and coordinator of Whose Knowledge? – a global, multilingual campaign to reimagine the internet to be for and from us all. She is supported by a fellowship from the Shuttleworth Foundation for this work.
Originally published by The Guardian as “We’re all connected now, so why is the internet so white and western?”