Language diversity in Gauteng

To mark South Africa’s heritage month, the September 2022 Map of the Month represents the languages spoken in Gauteng. It adds to earlier maps (July 2013 and September 2017) by using a more recent dataset, the GCRO Quality of Life Survey 6 (2020/21). During the survey, the 13 616 respondents, drawn from all 529 wards in the province, were asked ‘What is the main language spoken in this household?’ The results illustrate Gauteng’s cosmopolitan nature, which results from the pull of economic opportunities, initially to gold mines and then to the secondary and tertiary sectors. Most residents are either migrants to the province or descendants of earlier generations who migrated to the region after it took off economically in 1886.

Table 1 shows that the most prevalent language in the province is isiZulu - with one in four respondents naming it as the main language of the household. Sesotho is the second most prevalent language at 13.4%, Sepedi is spoken by 12% of households, English by 10.7%, and Setswana by 9.8%. All 11 official languages of South Africa have at least some representation in Gauteng. Furthermore, 3.2% of households in the province speak languages other than South Africa’s official languages at home.


Table 1: Main languages spoken in households in Gauteng.

This language diversity has a complex make-up and geography, shaped by, inter alia:

  1. Colonial policies of labour migration to Johannesburg and other urban centres in the region from labour-providing regions in South Africa and neighbouring countries.
  2. Efforts to limit urbanisation to only those required for labour. The policy of influx control directed urbanisation that would otherwise have occurred in major urban centres to ethnic ‘homelands’ that occupied land to the north of the province (Fig 5 in Mosiane and Götz 2022). Therefore the former territory of Bophuthatswana, which includes parts of what is now northern Tshwane, has primarily Setswana speakers, while the former ‘homeland’ of KwaNdebele, just over what is now Gauteng’s northeast border, is dominated by isiNdebele speakers.
  3. Apartheid’s Group Areas Act, which segregated people living within urban areas by race.
  4. Related policies that tried to segregate people ethnically within townships (Pirie 1984).
  5. More recent changes enabled, for example, by the upward mobility of previously excluded racial groups into suburbs once reserved for white residents (Crankshaw 2022).

There are different ways of representing this complex geography, each with benefits and limitations. For example, in July 2013 we represented the most prevalent language in Gauteng using the census Small Area Layer (SAL). This gives a striking impression of language distribution in the province, but it does not convey the language diversity within each SAL. Figure 1 shows an analysis of the GCRO Quality of Life Survey 6 (2020/21) data, illustrating that in more than half of the wards in the province (284 out of 529 wards), the most prevalent language is spoken by less than half of the households in the ward. This is mapped in Figure 2, which shows that in the majority of wards the most prevalent household language is not particularly dominant. In some wards (indicated in dark red), however, more than 75% of households speak the dominant language. This can be seen in wards in areas such as Ga-Rankuwa (south west of Soshanguve), Wonderboom (west of Mamelodi), Laudium (south east of Atteridgeville), Lenasia (to the south of Soweto), Linden (north west of Johannesburg), and Tsakane (south west of Springs). We also see that many of Johannesburg’s more affluent wards (in the northern suburbs) and wards south east of Pretoria (around Centurion) tend to be more homogeneous with respect to the language most frequently spoken by households. As we shall see below, the dominant language here is English. Some wards in townships, too, have a dominant language that is spoken in more than half of the households.


Figure 1: Wards grouped by the proportion of households speaking the most dominant language.


Figure 2: Wards shaded by the proportion of households speaking the most dominant language.
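
The ward-level summary behind Figures 1 and 2 can be sketched in a few lines of Python. This is a minimal illustration rather than the analysis code used for the maps: the records, ward names, weights, and band cut-offs below are invented, and the function names `dominant_language_share` and `share_band` are hypothetical. The sketch finds the most prevalent main household language in each ward, works out the weighted share of households that speak it, and groups wards into bands by that share.

```python
from collections import Counter, defaultdict

# Hypothetical records: (ward_id, main_household_language, survey_weight).
# These values are invented for illustration and are not QoL survey data.
records = [
    ("ward_A", "isiZulu", 1.0),
    ("ward_A", "isiZulu", 1.0),
    ("ward_A", "Sesotho", 1.0),
    ("ward_A", "English", 1.0),
    ("ward_B", "English", 1.0),
    ("ward_B", "English", 1.0),
    ("ward_B", "English", 1.0),
    ("ward_B", "English", 1.0),
    ("ward_B", "Afrikaans", 1.0),
    ("ward_C", "Setswana", 2.0),
    ("ward_C", "Sepedi", 1.0),
    ("ward_C", "Xitsonga", 1.0),
    ("ward_C", "isiZulu", 1.0),
]


def dominant_language_share(rows):
    """Return {ward: (most prevalent language, its weighted share of households)}."""
    totals = defaultdict(Counter)
    for ward, language, weight in rows:
        totals[ward][language] += weight
    result = {}
    for ward, counts in totals.items():
        language, top = max(counts.items(), key=lambda item: item[1])
        result[ward] = (language, top / sum(counts.values()))
    return result


def share_band(share):
    """Assign a share to a band. The cut-offs are assumptions for illustration."""
    if share < 0.5:
        return "under 50%"
    if share <= 0.75:
        return "50% to 75%"
    return "over 75%"


shares = dominant_language_share(records)
bands = Counter(share_band(share) for _, share in shares.values())

for ward, (language, share) in sorted(shares.items()):
    print(f"{ward}: {language} ({share:.0%}), band: {share_band(share)}")
```

With real survey records in place of the invented ones, `bands` would hold the number of wards in each band, which is the kind of grouping summarised in Figure 1.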

Dot density maps are one way of providing a more granular representation of different languages in an area. The limitation of this approach is that mapping every language spoken would result in a visually complex map. In Figure 3, each dot therefore represents one household that reported one of the four most frequently spoken home languages. The data used for mapping was weighted so that it represents the entire municipal population. Although this map does not include the home languages spoken by all respondents and their households, it nevertheless shows the distinct geography of each of these languages. English is concentrated in middle-class suburbs in the core of the province (around Centurion, Sandton [west of Alexandra] and Benoni [west of Daveyton]), Sepedi in Atteridgeville and Mamelodi, isiZulu in Soweto, Katlehong, and Tsakane, and Sesotho in Sebokeng, Khutsong, and Alexandra.


Figure 3: Dot density map of the four languages most commonly spoken at home by QoL respondents.
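
A dot density layer of this kind can be sketched as follows. The text above does not say how dots were positioned, so this sketch assumes the common approach of scattering each dot at a random location inside its ward. The ward outline, the weighted household counts, and the function names are all hypothetical, and a real workflow would normally use GIS software and actual ward boundaries.

```python
import random


def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon's vertex ring."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def random_point_in_polygon(polygon, rng):
    """Rejection-sample a uniformly random point inside the polygon."""
    xs = [p[0] for p in polygon]
    ys = [p[1] for p in polygon]
    while True:
        x = rng.uniform(min(xs), max(xs))
        y = rng.uniform(min(ys), max(ys))
        if point_in_polygon(x, y, polygon):
            return x, y


def dot_density(ward_polygon, weighted_households, households_per_dot=1, seed=0):
    """Return [(x, y, language)], one dot per `households_per_dot` weighted households."""
    rng = random.Random(seed)
    dots = []
    for language, weighted_count in weighted_households.items():
        for _ in range(round(weighted_count / households_per_dot)):
            x, y = random_point_in_polygon(ward_polygon, rng)
            dots.append((x, y, language))
    return dots


# Hypothetical ward outline (a triangle) and invented weighted household counts.
ward = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]
counts = {"isiZulu": 12.4, "Sesotho": 6.7, "Sepedi": 3.2, "English": 2.0}
dots = dot_density(ward, counts)
```

Each tuple in `dots` could then be plotted with a colour per language. Because positions within a ward are random, the dots show how many households speak each language in the ward, not where those households actually are.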

In 1984, Gordon Pirie published an article on the policy of ethno-linguistic segregation in townships (Figure 4). Although this policy was only ever partially implemented, it nevertheless did create ‘Nguni’ reserves, ‘Sotho’ reserves, and ‘Other’ reserves within Soweto. In the decades since the end of this policy, considerable mixing has occurred. Even so, Figure 5, which zooms into Soweto, shows that the distribution of languages there is still shaped by the spatial clustering of three language groups: ‘Nguni’, ‘Sotho’, and ‘Other’. Nguni languages are clustered around Zola, Zondi, Dube and Dlamini, while Sotho languages are clustered around Naledi, Tladi, Mofolo, and Meadowlands. Other languages are distributed throughout Soweto with some clustering in Diepkloof and Chiawelo. These patterns seem to correlate very closely with the ethno-linguistic segregation that was established in townships during apartheid.


Figure 4: Ethnolinguistic wards in Soweto under apartheid (published in Pirie 1984).


Figure 5: Dot density map of the five most commonly spoken languages in Soweto.

Even though dot density maps can offer a more granular picture of language diversity, they have a limitation of their own, which stems from the wording of the question on which the data is based. Residents of Gauteng are adept at code-switching and are able to communicate in many different languages in different settings. For example, the main language of a household may well not be the language an individual from the household speaks at work. These limitations to our ability to represent Gauteng’s languages are indicative of the remarkably cosmopolitan society of the province.


Crankshaw, O. (2022) Urban Inequality: Theory, Evidence and Method in Johannesburg. London: Zed.

Mosiane, N. and G. Götz. (2022) ‘Governing the GCR Series: Displaced Urbanisation or Displaced Urbanism? Rethinking Development in the Peripheries of the GCR.’ Provocation 8. Gauteng City-Region Observatory.

Pirie, G. (1984) ‘Ethno-linguistic zoning in South African black townships.’ Area, 16(4): 291-298.

Edits and input: Graeme Götz, Christian Hamann and Ngaka Mosiane.

Recommended citation: Khanyile, S. and Ballard, R. (2022). Language diversity in Gauteng. Map of the Month. Gauteng City-Region Observatory. September 2022.

