Determinants and Distribution of the South
African Labour Market Income, Evidence
from the South African National Income
Dynamic Study
Kevin Rodrigues (RDRKEV001) & Chay Stockdale (STCCHA002)
University of Cape Town
Abstract:
Inequality is a major problem in South Africa with the South African Gini coefficient
estimated to be 0.65 in 2014 (World Bank, 2015). This paper investigates the distribution and
determinants of labour market income, so that policy recommendations can emphasise the
significant determinants and thereby reduce income inequality. The paper estimates the
effect of 16 independent variables, 12 of which are categorical, on labour market income using
an ordinary least squares multiple linear regression model. Wave 3 of the
National Income Dynamics Study is used as a sample of the South African population. A
number of variables were found to have a significant impact on labour market income, with
education level having the largest individual impact. Other significant variables included
age, tenure, occupation, sector, province, geographic area, race, gender, union membership,
average hours, health status, marital status and English writing ability. It is recommended that
policy interventions focus on improving education levels, among other
recommendations that are likely to improve the incomes of individuals living in South Africa.
Keywords: Income Determinants; Income Distribution; NIDS Wave 3; South Africa
Introduction:
Inequality is a major problem in South Africa, with South Africa’s Gini Coefficient estimated
to be 0.65 (World Bank, 2015). It is important to understand the distribution and determinants
of labour market income within the country, so that income gaps can be closed by
focusing on improving wages through these determinants. This paper seeks to evaluate the
determinants and distribution of individual labour market income among the Employed
Economically Active South African Population (excluding the self-employed), using the
National Income Dynamics Study (NIDS) wave 3, 2012, data. The paper initially details a
literature review, describing relevant past research on the topic. A description of the
data used and of how it was manipulated follows. The paper then draws on past research to
clarify the selection of the dependent and independent variables of the model, and provides
expectations of the relationships. Descriptive statistics are provided for the variables,
showing an analysis of the data and underlying population through various summary
statistics, pie graphs and cross-tabulations. The regression procedure and regression results
are then provided, with comments for each independent variable in comparison to prior
expectations. Policy and intervention recommendations targeting the most
significant determinants of labour market income are then provided. The paper ends with
regression diagnostics, testing whether the assumptions of multiple linear regression held in
the model.
Literature Review:
Human Capital Theory describes factors that determine income in the labour market. Many
studies have been performed in an attempt to capture the determinants and distribution of
income. Given the legacy of apartheid, most South African studies in this area focus on
income inequality across race and gender. International research provides insight and offers
empirical evidence supporting other determinants of income.
Most models isolating the determinants of income are an expansion of Mincer’s (1974)
earnings function, which premised earnings as a positive function of education and work
experience. Work experience enters Mincer’s earnings function as a quadratic term. This was
an adjustment to prior studies, which showed that earnings were a quadratic
function of age (Mincer, 1974:46). By making work experience quadratic, Mincer (1974:46)
was able to capture the earnings-age relationship. Education is linearly related to earnings, as
Mincer (1974:47) observed that the profiles for work experience to earnings were
approximately parallel for different education levels. The model has been shown to hold
across many empirical regressions. A regression verifying Mincer’s model was performed on the
South African 2005 and 2006 Income and Expenditure Survey data (Botha, 2010:143). The
outcome of the regression showed that earnings were significantly and positively related to
work experience and education as theorised (Botha, 2010:143).
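For reference, the Mincer earnings function described above is conventionally written in the form below. The notation is assumed here for illustration and is not taken from the studies cited: w is earnings, S is years of schooling, X is years of work experience, and the signs shown are the ones the theory leads one to expect.

```latex
\ln w_i = \beta_0 + \beta_1 S_i + \beta_2 X_i + \beta_3 X_i^2 + \varepsilon_i,
\qquad \beta_1 > 0,\quad \beta_2 > 0,\quad \beta_3 < 0
```

The linear schooling term corresponds to the roughly parallel experience-earnings profiles across education levels, while the squared experience term produces the concave earnings profile.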
UK research by Miles (1997:23) on micro household data reconfirms the
education and work experience variables as significant determinants of income, and expands
the function to demonstrate that household income varies significantly according to the
household’s geographical location and the gender of the head of the household. His research
reported that a female head of household earned less on average than a male head, and that a
person's earnings varied according to the city in which the person lived (Miles, 1997:23). In South
Africa, it was discovered that a household that is within an urban area and headed by a white
male is less likely, than any other geographical or demographic classification, to live below
the poverty line (Botha, 2010:142).
Statistics based on the Income and Expenditure Survey (IES) for 2010 and 2011 showed that the
average annual household income for Black African households was R69 632, compared to an
average of R387 011 for the white population group (Statistics South Africa, 2012:11). The white
population’s share of total income exceeds its share of the population (Statistics South
Africa, 2012:12). The Gini coefficient for South Africa was last measured in 2014 at 0.65,
compared to Brazil’s at 0.53 (World Bank, 2015). A Gini coefficient of 1 represents perfect
inequality. Because this income inequality runs along racial lines in South Africa
(Statistics South Africa, 2012:12), race can be considered a determinant of income.
Research performed by Van den Berg and Louw (2003:19) found that income is unequally
distributed both within and between race groups in South Africa. While rising black incomes
have contracted the inter-racial income gap, the rising intra-racial inequality has caused an
increase in the Gini coefficient (Van den Berg and Louw, 2003:19). This finding of race as a
significant determinant of income has been confirmed by the South African Reserve Bank’s
2006 report on the determinants of public and private-sector wages in South Africa (Bosch,
2006:9).
Bosch (2006:22) found that an individual's industry and sector, defined as either private or
public, significantly influenced their income. While individuals generally earn higher wages
in the private sector, the public sector showed a fairer distribution across all wages with a
lower maximum wage (Bosch, 2006:24). The report revealed that wages were distributed
more fairly across gender and population group in the public sector (Bosch, 2006:24). This
could be due to the public sector embodying equity policy (Bosch, 2006:24). In the same
report Bosch found that job profession and union membership significantly affect South
African earnings (Bosch, 2006:9). It was found that those in higher skilled professions and
those who were union members had greater earnings figures (Bosch, 2006:9). Daniels and
Rospabé (2005:12) confirm these findings about job profession and union membership.
In addition to the above findings on job profession, union membership, education attainment,
work experience, race, age, gender, geographical location and industry, Daniels and Rospabé
(2005:11) found that marriage, household headship, nature of employment activity (formal or
informal) and monthly hours worked significantly affect an individual’s wage. By performing
a regression analysis on the 1999 Statistics South Africa October Household Survey data they
found that on average married people earn significantly more than those who are not married
(Daniels and Rospabé, 2005:11). On average those who are the head of a household earn
significantly higher wages than those who are not (Daniels and Rospabé, 2005:11). Individuals
involved in formal economic activities earn significantly higher incomes than those involved
in informal activity. Finally, an individual’s monthly hours worked significantly affect
their income earned (Daniels and Rospabé, 2005:11).
Data and Methods:
Data Description
This study uses Wave 3 of the National Income Dynamics Study (NIDS) to investigate the
distribution and determinants of labour market income in South Africa. The study is limited
to the employed, economically active population, excluding those that are self-employed.
NIDS Wave 3 Data
The NIDS survey is a face-to-face longitudinal study of individuals in South Africa and the
people living in their households. It is a combination of multiple individual and household
level questionnaires covering various aspects of the individuals and households including,
amongst other things, their demographics, household characteristics, income and employment
information, health status and education attainment. The NIDS Wave 3 Data is the latest
cross section of the individuals followed throughout the NIDS study collected in 2012
(Southern Africa Labour and Development Research Unit, 2013:2).
Quality of the NIDS Wave 3 Data
The quality of the data that underpins the analysis was integral to the accurate determination of
the independent variables. The NIDS Wave 3 provides a broad range of household and
individual data. A major reason for the survey was to measure income mobility within South
Africa and thus much of the data regarding earnings was relevant to the analysis.
The Wave 3 cross-sectional data consists of 8040 successful household interviews and
32 633 successful individual interviews (Southern Africa Labour and Development Research
Unit, 2013:4). After preparing the data for the regression analysis in this particular study
there were 6210 observations left. The survey was designed with the
intention of being nationally representative, and is thus relevant for defining the determinants of
labour market income for South Africa.
Data Manipulation
The NIDS Wave 3 data was manipulated using the statistical package Stata 14. This study is
limited to employees within the employed, economically active population, excluding the
self-employed.
Preparation and Manipulation Process of Data
1. Data Merging:
All separate databases contained in the NIDS Wave 3 study were merged into one
database before manipulations.
2. Removal of observations that were not successfully collected:
All observations in the NIDS Wave 3 database that were not successfully collected have
been deleted; only those surveys that were successfully collected are used in our
analysis. This solves many problems, including removing those people who are deceased
and removing people who are members of two households, as only one survey will be
successfully completed per person and mapped to the household they were in when the
survey was completed.
3. Removing all people below 15 years of age and above 65 years of age:
This study is limited to the economically active population which, as defined by the Department
of Labour of South Africa, excludes all people below 15 and above 65 (Department
of Labour, 2015). Thus all people below the age of 15 and above the age of 65 have been
removed from the sample considered.
4. Deleting all discouraged, unemployed and not economically active individuals:
This study is limited to those that are employed, excluding the self-employed. Thus all
observations that are identified as unemployed, discouraged and not economically active
have been removed.
5. Removing Child and Proxy data:
The child and proxy questionnaires used in the NIDS wave 3 study have insufficient
information relating to employment and employment related factors to perform a
consistent comprehensive analysis on the determinants of labour market income. These
observations have thus been removed from the sample.
6. Construction of a Labour Market Income variable:
The labour market income variable has been constructed using an individual-level variable
(w3_fwag) used by NIDS in their aggregation of household labour market income.
w3_fwag is the wage from a person’s primary and secondary occupations. This variable is
collated across all respondents. Where a response was a bracket response, the mid-point of
the bracket has been taken. For those who indicated that they received income above the
highest band, double the upper value of the highest band has been used for their income.
The 4.97% of values that were missing have been imputed using regression.
The labour market income variable was derived by taking w3_fwag and subtracting net
secondary occupation income (em2pay) where this was greater than zero. The income values
have been deflated such that they are recorded in August 2012 monetary terms.
7. Deleting all observations where labour market income is zero:
It has been assumed that all those that have a labour market income of zero are either self-
employed or not economically active, yet incorrectly recorded as being economically
active. They thus do not form a part of this analysis. All observations with a labour
market income of 0 have been deleted.
8. Construction of a Tenure variable:
To determine tenure the start date in months and years in current occupation was
subtracted from the date of the NIDS interview in months and years. Negative values,
meaning that the individual was starting employment after the interview date, were
removed.
9. Cleaning and Categorizing Variables:
Following this the prospective independent variables were cleaned and categorized for
use in the analysis.
10. Weighting:
To ensure that the inference drawn from this study is applicable in the national context, the
NIDS wave 3 survey data was weighted using post-stratified sample weights.
Following the cleaning and manipulation of the NIDS Wave 3 survey data the original
sample size was reduced to 6210 observations.
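Some of the preparation steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions and not the authors' Stata code: the function names, the dictionary keys and the band values (1000, 2000, 8000) are hypothetical, and only the bracket mid-point rule, the top-band doubling rule, the age filter, the zero-income filter and the negative-tenure filter are covered.

```python
def bracket_to_income(lower, upper, highest_band_upper):
    """Point estimate for an income bracket response.

    upper is None when the respondent reported income above the highest band.
    """
    if upper is None:
        # Open-ended top response: double the upper value of the highest band.
        return 2 * highest_band_upper
    # Closed bracket: take the mid-point of the bracket.
    return (lower + upper) / 2


def tenure_in_months(start_year, start_month, interview_year, interview_month):
    """Months between the start of the current job and the interview."""
    return (interview_year - start_year) * 12 + (interview_month - start_month)


def prepare(records):
    """Apply the age, zero-income and negative-tenure filters to a list of dicts."""
    kept = []
    for r in records:
        if not 15 <= r["age"] <= 65:   # step 3: keep ages 15 to 65
            continue
        if r["income"] <= 0:           # step 7: drop zero labour market income
            continue
        tenure = tenure_in_months(r["start_year"], r["start_month"],
                                  r["interview_year"], r["interview_month"])
        if tenure < 0:                 # step 8: job starts after the interview
            continue
        kept.append({**r, "tenure_months": tenure})
    return kept


# Hypothetical respondents, invented purely for illustration (not NIDS data).
dates = {"interview_year": 2012, "interview_month": 8}
sample = [
    {"age": 34, "income": bracket_to_income(1000, 2000, 8000),
     "start_year": 2010, "start_month": 3, **dates},
    {"age": 70, "income": 5000, "start_year": 1990, "start_month": 1, **dates},
    {"age": 28, "income": 0, "start_year": 2011, "start_month": 6, **dates},
    {"age": 41, "income": bracket_to_income(8000, None, 8000),
     "start_year": 2012, "start_month": 11, **dates},
]
cleaned = prepare(sample)
```

In this toy sample only the first respondent survives: the second is outside the age range, the third reports zero income, and the fourth has a job start date after the interview.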
Dependent Variable
The labour market income variable constructed reflects monthly income from the primary
occupation of an individual. The distribution of the labour market income variable was
heavily skewed and non-normal. This skewness in income is consistent with Daniels and
Rospabé’s (2005:6) findings. The transformation of labour market income that best
approximated a normal distribution was the natural log of the variable. The dependent
variable used in this analysis is therefore the natural log of labour market income.
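The effect of the log transformation on skewness can be illustrated with a short Python sketch. The income values below are invented to resemble a right-skewed distribution and are not NIDS observations; the skewness formula is the unweighted moment-based one, which may differ from the weighted statistic reported by the statistical package.

```python
import math


def skewness(values):
    """Moment-based skewness: third central moment over variance to the power 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical monthly incomes in rand, for illustration only.
incomes = [200, 800, 1600, 3100, 6400, 11900, 36300]
log_incomes = [math.log(v) for v in incomes]

raw_skew = skewness(incomes)      # strongly positive for these values
log_skew = skewness(log_incomes)  # much closer to zero after taking logs
```

For values like these, the skewness of the logged series is far closer to zero than that of the raw series, which is the property that motivates using the natural log as the dependent variable.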
Independent Variables
Each independent variable chosen was supported by theoretical reasoning and empirical
evidence from prior research.
Age
Mincer’s (1974:46) original equation theorised age as a proxy for work experience: as age
increases so does work experience and thus income. Age-earnings profiles empirically
support a significant positive relationship between age and earnings (Murphy, 1990:202).
Age squared
The age squared term captures the effect of age on an individual’s productivity. Beyond a
certain age productivity is expected to decline. The age squared term is expected to have a
negative relationship with income. This reasoning is similar to the reasoning Mincer
(1974:50) provided for the work experience squared term, which involved the over-
specialisation of an individual such that he becomes less employable. This reasoning is
empirically supported by regressions on age-earnings profiles, which show that the age
squared term significantly reduces income (Murphy, 1990:203).
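The role of the squared term can be made explicit with a short derivation. The notation is assumed here for illustration: y is labour market income, and the coefficients on age and age squared are written as generic symbols because no estimates have been reported at this point in the paper.

```latex
\ln y = \beta_0 + \beta_{1}\,\mathrm{Age} + \beta_{2}\,\mathrm{Age}^2 + \dots
\quad\Longrightarrow\quad
\frac{\partial \ln y}{\partial\,\mathrm{Age}} = \beta_{1} + 2\beta_{2}\,\mathrm{Age},
\qquad
\mathrm{Age}^{*} = -\frac{\beta_{1}}{2\beta_{2}}
```

With a positive coefficient on age and a negative coefficient on age squared, expected log income rises with age up to the turning point Age* and declines thereafter.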
Education
Mincer’s (1974:46) earnings function expressed a positive linear relationship between
education and income. Higher education is theorised to allow an individual to perform more
complicated tasks, which are rewarded with a higher income. Education’s positive impact on
income is empirically supported by Daniels and Rospabé (2005).
For the purposes of this study, education years have been recoded and categorized into no
schooling, some/complete primary schooling, some secondary schooling or secondary
equivalent, matric or equivalent, diploma, undergraduate degree and postgraduate
qualification. The base category used was no schooling.
Tenure
The longer an employee is with a company the more likely the employee becomes part of the
core operations of the business, and hence the more valuable to the company. In return the
employee’s salary should increase. This reasoning supports the inclusion of the tenure
variable into the model. Consistent with the findings of Daniels and Rospabé (2005) it is
expected that as tenure increases so does labour market income.
Average Hours Worked per week
The more hours an individual works in a week the greater his salary/wage should be for the
week, if paid overtime or on an hourly basis. This simple reasoning warranted the inclusion
of this variable. Consistent with the findings of Daniels and Rospabé (2005) it is expected
that as average hours worked increases so does labour market income.
Union Membership
Being a union member should increase income through the wage bargaining process.
Consistent with the findings of Bosch (2006:23) union membership is expected to have a
significant positive impact on income. The base category in the following analysis is that an
individual is not a union member.
Occupation
Occupations that require more complex skills should pay higher salaries. Bosch (2006) finds
that occupations that require more skill tend to pay larger salaries. The base occupation
category in this analysis is elementary occupations, which, requiring the lowest skill level, are
expected to have the lowest income relative to other occupations.
Sector
It is expected that certain sectors pay premium wages. This is evident in the minimum
sectoral wages; for instance, the minimum sector wage for agriculture is lower than the
minimum sector wage for the mining sector (Bosch, 2006:20). Bosch (2006) finds that there
are sectors which pay wage premiums, resulting in greater income for individuals
employed in those sectors. The following analysis uses the private household sector as the
base category, as it is expected that this sector pays the lowest wages. It is expected
that the manufacturing and transportation sectors pay the highest wage premiums, as these
sectors have the highest minimum sectoral wages (Benhura and Gwatidzo, 2013:9).
Gender
According to 2014 tax statistics, females earn on average a third of what males earn (South
African Revenue Services, 2014:9). Bosch (2006) supports this, finding that female workers
on average earn lower wages. In line with this, it is expected that females will earn less than
males. The base category for the following analysis was female.
Race
Due to the legacy of apartheid, inequality in South Africa often manifests along racial lines
(Statistics South Africa, 2012:12). Van den Berg and Louw (2003) find race to be a relevant
factor in determining income in South Africa.
The African race is used as the base category in the analysis. It is expected that an individual
belonging to the White race category will earn more labour market income than any other
category, as this group was the beneficiary of apartheid.
Marital Status
An individual who is married is expected to have more responsibilities and commitments
than a single individual and therefore has a greater incentive to earn more income. Consistent
with Daniels and Rospabé (2005) it is expected that married individuals earn more than single
individuals. This variable has been categorised into married and unmarried; the base category
is unmarried.
Head of Household
An individual who is the head of household has more responsibilities than an individual who
is not, and therefore has a greater incentive to earn more income. In line with Miles (1997)
household headship is expected to be a significant determinant of income. The base category
in this analysis is that an individual is not the head of a household.
Health Status
The sicker an individual is the less likely the individual will be willing and able to work and
the less reliable they will be as an employee. This should reduce their labour market income.
Health status is categorised into two categories: good to excellent and fair to poor. The base
category for this analysis is fair to poor health.
Geographic Area
In line with the 2012 Income and Expenditure of Households report by Statistics South Africa
it is expected that the population living in rural areas earns on average less than those in
urban areas. The base category used for this variable is rural.
Province
It is expected that provinces with higher GDPs have a greater amount of economic activity
occurring in them, resulting in premium wages being paid in these provinces. Daniels and
Rospabé (2005) find that province has a significant impact on earnings.
The base category for the following analysis is the Eastern Cape, as this province has a
relatively low GDP and had the lowest income levels in 2013 (Statistics South Africa,
2013:46). It is expected that individuals living in Gauteng Province and Western Cape
Province will earn the highest labour market incomes, as these two provinces have the highest
GDP levels and income levels, and are economic hubs of South Africa (Statistics South
Africa, 2013:46).
English Writing Level
English is the most common language used in commercial South Africa. It is thus hypothesised
that an individual with a greater proficiency in English will be open to better opportunities
and thus earn higher income. The base category for this analysis is no English writing
ability.
Descriptive Statistics:
Distribution of Labour Market Income
Table 1: Summary Statistics for Labour Market Income
Statistic                            Value
Observations                         6,181
Population weight of observations    12 973 193
Mean                                 5514.503
Standard deviation                   7421.482
Variance                             55 100 000
Skewness                             4.348522
Kurtosis                             34.85887

Percentile                           Value
1%                                   200
5%                                   501.0225
10%                                  812.4787
25%                                  1585.44
50%                                  3100
75%                                  6351.852
90%                                  11890.8
95%                                  18092.31
99%                                  36329.06
Figure 1: Histogram showing the distribution of Labour Market Income in South Africa
Summary Statistics of Age, Education, Race, Gender, Occupation and Sector
Variable         Observations  Weight       Mean      Standard Deviation  Minimum  Maximum  Mode
Age              6,181         12973193.00  37.06728  10.4499             15       64       32
Education Years  6,173         12964349.40  10.46122  3.225948            0        18       12
Race             6,181         12973193.00  -         -                   -        -        African
Gender           6,181         12973193.00  -         -                   -        -        Male
Occupation       5,384         11575098.90  -         -                   -        -        Elementary Occupations
Sector           5,283         11272926.30  -         -                   -        -        Community, Social and Personal Services
The above histogram, Figure 1, gives a picture of the distribution of labour market income
in the South African population. Only people with incomes below R100 000 have been
plotted on the histogram to enhance readability. It is clear that the South African population
has a highly unequal distribution of income. This is consistent with the World Bank’s (2015)
estimated Gini coefficient of 0.65 stated in the literature review. The distribution of income in
South Africa is heavily skewed to the right. The monthly labour market income of the South
African population is highly concentrated around the peak in the distribution, with the
majority of people earning less than R10 000 per month.
This is further informed by the summary statistics in Table 1 above. The skewness of 4.35
deviates widely from the skewness of 0 of the normal distribution, confirming the right skew
apparent in the histogram. The kurtosis of 34.86 is an indication of how clustered the
observations of South African labour market income are around the median value of R3 100.
50% of all employed economically active South Africans (excluding the self-employed) earn
between R1 585 and R6 352. To further illustrate how clustered monthly income values are
below the value of R10 000, 90% of all employed economically active South Africans earn
below R11 891. Table 2 presents the summary statistics of the explanatory
variables of labour market income.
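Because the survey data is weighted, percentiles such as those in Table 1 are presumably weighted percentiles. The sketch below shows one simple convention for computing them (the smallest value at which the cumulative weight share reaches the requested level). This convention is an assumption made for illustration; the statistical package used in the study may apply a different rule, and the function name is hypothetical.

```python
def weighted_percentile(values, weights, q):
    """Smallest value whose cumulative weight share reaches q, for 0 < q <= 1."""
    pairs = sorted(zip(values, weights))
    total = sum(weight for _, weight in pairs)
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= q * total:
            return value
    return pairs[-1][0]


# Hypothetical incomes and weights, for illustration only.
incomes = [1000, 2000, 3000, 4000]
equal_weights = [1, 1, 1, 1]
top_heavy_weights = [1, 1, 1, 5]

median_equal = weighted_percentile(incomes, equal_weights, 0.5)
median_top_heavy = weighted_percentile(incomes, top_heavy_weights, 0.5)
```

With equal weights the median falls on the second value, while giving the highest earner five times the weight moves the median to the top value, showing how the weights shift the percentiles toward the groups they over-represent.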
Summary Statistics of population
Table 2: Summary Statistics of Age, Education Years, Race, Gender, Occupation and
Sector
Age
The minimum and maximum age for the population is 15 years and 64 years respectively.
The sample was cleaned to remove all people below 15 and over 64, as the formal definition
for the economically active population includes people from 15 to 64 (Department of Labour,
2015). The average age for the employed economically active population is 37.07 years,
with a standard deviation of 10.45 years. The most common age within the population is 32
years.
Education years
The minimum education years is 0, meaning a person has no schooling. The maximum is 18
years, meaning that a person has either a Master's degree or a Doctorate. The average number of
years of education in the population is 10.46 years with a standard deviation of 3.23 years.
The average of 10.46 years is approximately equivalent to a Grade 10 school qualification.
The most common number of years of education within the population is 12 years, which is
equivalent to a matric. After determining the summary statistics in years, the education
variable was categorised into no schooling, some/complete primary schooling, some
secondary schooling or secondary equivalent, matric or equivalent, diploma, undergraduate
degree and postgraduate qualification.
Race
Race is defined as 1. African, 2. Coloured, 3. Asian/Indian and 4. White. The mean,
minimum and maximum do not provide meaningful information, as they are based on the
number demarcating the race outcome, and have been removed from the summary statistics.
The mode illustrates that the largest race group in the overall population is African.
Table 2.1: Summary Statistics for Race
Race Group       Frequency    Percent  Cumulative
1. African       4,499.9077   72.8     72.8
2. Coloured      680.609865   11.01    83.81
3. Asian/Indian  193.057955   3.12     86.94
4. White         807.424465   13.06    100
Total            6,181        100
Table 2.1 illustrates the group percentages according to race. The table shows that 72.8% of
the population is African, which is broadly comparable to Statistics South Africa’s 2015 mid-year
results that had the African race at 80.5% of the population. The Asian/Indian group is
the smallest race group, representing only 3.12% of the population.
Gender
The mode shows that the population consists of more males than females.
Table 2.2: Summary Statistics for Gender
Table 2.2 illustrates that 57.63% of people in the considered population are male while
42.37% of people in the population are female.
Occupation
The mode shows that the most common occupation within the population is elementary
occupations. Elementary occupations are defined as paid jobs that require the completion of
simple, routine tasks, which could require the use of hand tools or physical exertion
(International Labour Organisation, 2015).
Table 2.3: Summary Statistics for Occupation
Table 2.3 illustrates that the largest share of people in our sample, 25.89%, are involved in
elementary occupations. This is in line with the mean and mode for years of education, as
elementary occupations require limited or no education. A European paper suggests that as
educational attainment increases, the majority of the population should be in occupations that
require higher education; furthermore, as technology advances, fewer workers will be required in
elementary occupations, but those workers would require higher education (CEDEFOP,
2012:10).
The skilled agriculture, forestry and fishery occupation makes up the smallest percentage at
0.28%, which is consistent with the relatively slow mechanisation in South Africa in this sector
(Aliber & Simbi, 2000:2).
Sector
The mode shows that the most common sector to work in, within the population, is the
community, social, and personal services sector.
Table 2.4: Tabulation of Sector variable
Sector                                         Frequency    Percent  Cumulative
0. Private households                          504.885079   9.56     9.56
1. Agriculture, hunting, forestry and fishery  379.136345   7.18     16.73
2. Mining and Quarrying                        390.284465   7.39     24.12
3. Manufacturing                               515.741469   9.76     33.88
4. Electricity, gas and water supply           162.799577   3.08     36.96
5. Construction                                323.03887    6.11     43.08
6. Wholesale and Retail trade; repair etc      388.817069   7.36     50.44
7. Transport, storage and communication        405.4604513  7.67     58.11
8. Financial intermediation, insurance, etc    338.959187   6.42     64.53
9. Community, social and personal services     1,676.1041   31.73    96.26
10. Catering and accommodation                 197.77335    3.74     100.00
Total                                          5,283        100
Table 2.4 illustrates that 31.73% of the population works in the community, social and
personal services sector; this represents the largest concentration of the population according
to sector. The South African Reserve Bank shows that 8.8% of the total wage bill for the
private sector was attributable to the community, social and personal services sector (Bosch,
2006:19).
The smallest sector is the electricity, gas and water supply sector, which employs 3.08% of the
population.
Analysis of the income by covariates
Geographical area type
Table 3: Summary Statistics of Labour Market Income by Geographic Area Type
Area         Mean       Standard Deviation  Frequency      Population Proportion  Observations
Traditional  3416.5542  4134.0827           2 424 755.00   18.69%                 1,524
Urban        6277.1992  8142.2836           9 670 880.00   74.55%                 3,911
Farms        2906.2061  3337.3939           877 558.00     6.76%                  746
Total        5514.503   7421.4825           12 973 193.00  100.00%                6,181
Table 3 illustrates that the average monthly income earned by the portion of the population living
in traditional areas, urban areas, and farm areas is R3416.55, R6277.20 and R2906.21
respectively. People living in urban areas earn on average the highest monthly income. People
living in farm areas earn on average the lowest monthly income. The difference between the
urban and farm area average monthly incomes is R3370.99, which is approximately equal to
the average monthly income earned in traditional areas. Urban areas, with a standard deviation
of R8142.28, have the greatest dispersion in labour market income. The standard
deviations for traditional and farm areas are R4134.08 and R3337.39 respectively.
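As a consistency check on Table 3, the overall mean should equal the population-weighted average of the three area means. The Python sketch below reproduces that calculation and the urban-farm gap using the means and population frequencies reported in the table. Variable names are chosen here for illustration.

```python
# (mean monthly income in rand, weighted population frequency) from Table 3
areas = {
    "Traditional": (3416.5542, 2424755),
    "Urban": (6277.1992, 9670880),
    "Farms": (2906.2061, 877558),
}

total_population = sum(freq for _, freq in areas.values())

# Population-weighted average of the area means.
overall_mean = sum(mean * freq for mean, freq in areas.values()) / total_population

# Gap between the highest (urban) and lowest (farm) area means.
urban_farm_gap = areas["Urban"][0] - areas["Farms"][0]
```

The weighted average matches the overall mean of R5514.50 in the Total row, and the gap matches the R3370.99 quoted in the text.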
Figure 2: Pie Graph of Monthly Labour Market Income across Geographic Areas
Figure 2 above illustrates that 84.86% of the total monthly income is earned in urban areas,
despite only 74.55% of the population living in urban areas. Figure 2 further illustrates
that 11.58% of total monthly income is earned in traditional areas and 3.56% is earned in
farm areas, even though 18.69% of the population live in traditional areas and the remaining
6.76% live in farm areas. This means that people who live in urban areas earn on average more
than those in traditional and farm areas, which is illustrated by the means in Table 3.
Following this, traditional and farm areas were combined to form rural areas, due to their
similarity.
Race Groups
Table 4: Summary Statistics of Labour Market Income by Race Group
Race          Mean       Standard Deviation  Frequency      Population Proportion  Observations
African       4115.7063  5176.7689           9444778        72.80%                 4,518
Coloured      4560.2441  5628.6571           1428520        11.01%                 1,314
Asian/Indian  11183.206  12048.051           405206         3.12%                  88
White         12759.201  11829.756           1694689        13.06%                 261
Total         5514.503   7421.4825           12 973 193.00  100.00%                6,181
Table 4 illustrates the average monthly income earned according to race group.
African, Coloured, Asian/Indian and White population groups earn a monthly average of
R4115.71, R4560.24, R11183.21, and R12759.20 respectively. The White group has the
highest average, closely followed by the Asian/Indian group. The African group has the
lowest average. The Coloured average is R444.53 above the African average. The difference
between the White and African averages is R8643.49, which is approximately equal to the
sum of the average monthly incomes earned by the African and Coloured groups.
The Asian/Indian group has the highest standard deviation of R12048.05, meaning this group
has the greatest dispersion in monthly income. The White group also has a high standard
deviation of R11829.76, compared to the African and Coloured groups, which have standard
deviations of R5176.77 and R5628.66 respectively. Therefore, the African and Coloured groups
have a relatively low dispersion in monthly income.
Figure 3: Pie Graph of Monthly Labour Market Income across Race Group
Figure 3 above illustrates that 54.34% of the total monthly income is earned by the African
population group, even though 72.80% of the population is African. Figure 3 further
illustrates that 30.22% of the total monthly income is earned by the White population, 9.11%
by the Coloured population and 6.33% by the Asian/Indian population, even though 13.06%
of the population is White, 11.01% of the population is Coloured and the remaining 3.12% of
the population is Asian/Indian. This means that the White and Asian/Indian population groups
earn more of the total monthly income than would be expected according to the size of each
group relative to the total population.
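The income shares quoted above can be reproduced from the Table 4 figures, since a group's share of total income is its mean income multiplied by its weighted population frequency, divided by the same product summed over all groups. The following is a minimal Python sketch of that arithmetic; the dictionary and function names are illustrative assumptions and are not part of the original analysis.

```python
# Mean monthly income (rand) and weighted population frequency from Table 4.
TABLE_4 = {
    "African": (4115.7063, 9444778),
    "Coloured": (4560.2441, 1428520),
    "Asian/Indian": (11183.206, 405206),
    "White": (12759.201, 1694689),
}


def income_and_population_shares(table):
    """Return {group: (income share in %, population share in %)}."""
    total_income = sum(mean * freq for mean, freq in table.values())
    total_people = sum(freq for _, freq in table.values())
    return {
        group: (100 * mean * freq / total_income, 100 * freq / total_people)
        for group, (mean, freq) in table.items()
    }


shares = income_and_population_shares(TABLE_4)
for group, (income_share, population_share) in shares.items():
    print(f"{group}: {income_share:.2f}% of income, {population_share:.2f}% of population")
```

A group whose income share exceeds its population share earns more than the population mean on average, which is the comparison made in the paragraph above.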
Gender Groups
Table 5: Summary Statistics of Labour Market Income by Gender
Table 5, above provides evidence of discrepancies in income between different gender
groups. The mean income for males is R6076.22, while that for females is R4 750.46. Males
on average earn more than females and have a higher dispersion in their earnings, as they
have a standard deviation of R8233.59, compared to the standard deviation for females of
R6065.69.
Figure 4: Pie Graph of Monthly Labour Market Income across Genders
As illustrated in Figure 4 above, males earn 63.5% of all monthly labour market income in
South Africa, while females earn 36.5% of all monthly labour market income. This is despite
the fact that the considered population consists of 57.63% males and 42.36% females. Males
earn disproportionately more income than females.
Age Cohorts
Table 6: Summary Statistics of Labour Market Income by Age Cohort
Table 6 above provides summary statistics of labour market income by age cohort; from this
table we can deduce a relationship between monthly labour market income and age. The
lowest mean wage (R2 146.24) is earned by those within the 15-20 year age category, while
the highest mean wage (R7 939.87) is earned by those in the 50-55 year age category.
A person’s income tends to increase as they get older until it peaks within
the age band 50 to 55. Income seems to rise faster than it falls: the increases from one
cohort to the next up to the peak range from about R800 to over R1000, while the decreases
after the peak are below R500.
Age cohorts above 40 earn a higher mean monthly income than the population mean. Those
below 40 tend to earn a lower monthly income than the population mean value.
The dispersion of income also seems to increase as people get older, save for the initial two
categories, 15-20 and 20-25, which deviate from the trend. Dispersion of income tends to
increase from the 20-25 age cohort, with a standard deviation of R2 665.15, to a
maximum of R12 095.09 in the 55-60 age cohort, and then decreases, as the mean wage does.
Age cohorts between 40 and 60 tend to have wages that are more dispersed than the wages of
the entire population.
Figure 5: Pie Chart of Monthly Labour Market Income by Age Cohort
Figure 5 above indicates that, despite individuals in the age cohort 50-55 years earning the
highest mean income, this group earns only 6.519% (the 4th lowest proportion) of all monthly
labour market income earned by the population. This is because they make up only
8.39% of the population under consideration, which is small relative to
other age cohorts that make up as much as 17.28% of the population under consideration.
The cohorts that earn the top three proportions of all population monthly labour market
income, in descending order, are age cohort 35-40, age cohort 40-45 and age cohort 30-35.
These population groups form the middle income bracket considering their means but occupy
much higher proportions of the population, 16.62%, 13.68% and 17.28% respectively.
The cohorts that earn the bottom three proportions of all population monthly labour market
income, with the lowest proportion holder first, are age cohort 15-20, age cohort 60+ (60-64)
and age cohort 20-25. These population groups occupy 1.45%, 1.77% and 9.6% of the total
population under consideration, respectively. The population cohorts 15-20 and 20-25 earn
the lowest mean incomes as well.
Regression Method
The effects of the above mentioned independent variables on log labour market income have
been estimated through an ordinary least squares multiple linear regression. This estimation
method estimates the effects of the independent variables on the log labour market income by
fitting a line of best fit that minimises the sum of the squared differences between the log
labour market income values predicted by the estimation line and the actual log labour market
income values. Categorical independent variables in the regression model shift the intercept
of the estimated regression line relative to some base category, and numerical
independent variables determine the slope of the estimated regression line.
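The distinction between intercept shifts and slopes can be illustrated with a small sketch. The Python code below fits ordinary least squares through the normal equations on hypothetical, noise-free data with one dummy variable and one numerical variable. The data, variable names and helper functions are assumptions made for illustration only; the paper's own model was estimated on NIDS Wave 3 with many more regressors.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def ols(rows, y):
    """Coefficients minimising the sum of squared residuals: solve (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical data: log income = 7.0 + 0.3 * union + 0.02 * tenure (no noise).
tenure = [0, 2, 5, 10, 1, 4, 8, 12]
union = [0, 0, 0, 0, 1, 1, 1, 1]
log_income = [7.0 + 0.3 * u + 0.02 * t for u, t in zip(union, tenure)]

# Each row of the design matrix is [constant, union dummy, tenure].
design = [[1.0, float(u), float(t)] for u, t in zip(union, tenure)]
intercept, union_shift, tenure_slope = ols(design, log_income)
print(intercept, union_shift, tenure_slope)
```

In this sketch union members and non-members share the same tenure slope, and the dummy coefficient moves the whole line up relative to the base category (non-members).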
In order for ordinary least squares (OLS) multiple linear regression parameters to be unbiased
- meaning that on average, if we take infinitely many samples from the same population and
estimate the effect using this method, the estimated effect of the independent variable on the
dependent variable in the sample is equal to the effect of the independent variable on the
dependent variable in the population - four assumptions must hold. First, the population model
must be linear in parameters: in the underlying population model the dependent variable must be
linearly related to the independent variables and some error term. Second, a random sample
from the underlying population must be collected: each member of the population must have
the same probability of being included in the sample. Third, there must be no perfect
collinearity between independent variables: none of the independent variables may be
constant and there must be no exact linear relationship among independent variables. Fourth,
the expected value of the error term (u) - the factors other than the independent variables that
affect the dependent variable - given any values of the independent variables must be equal
to zero.
In order for the estimated parameters to be considered the best linear unbiased estimators -
estimators with the least variance when compared to all other linear unbiased estimators, of
the underlying population parameters - a fifth assumption must hold. This assumption is the
assumption of homoscedasticity, where the error term has the same variance given any value
of the independent variables.
For the estimated sample parameters to be considered minimum variance unbiased estimators
- estimators with the smallest variance of all unbiased estimators - a sixth assumption must
hold. This is the assumption that the population error is independent of the explanatory
variables and normally distributed with zero mean and variance of σ². Assumptions 1-6 above
are the classical linear model assumptions. This paper will proceed assuming that the
classical linear model assumptions hold for the regression output and analysis section. These
assumptions will then be tested.
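For reference, the assumptions that involve the error term can be written compactly. The notation below (k regressors x_1 to x_k, coefficients β_0 to β_k) is assumed for illustration and is not taken from the paper's own equations.

```latex
% Assumption 1: population model linear in parameters
\ln(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_k x_k + u

% Assumption 4: zero conditional mean
\mathrm{E}(u \mid x_1, \dots, x_k) = 0

% Assumption 5: homoscedasticity
\mathrm{Var}(u \mid x_1, \dots, x_k) = \sigma^2

% Assumption 6: normality, with u independent of the regressors
u \sim \mathrm{Normal}(0, \sigma^2)
```

Assumptions 2 and 3 (random sampling and no perfect collinearity) concern the sample and the regressors rather than the error term, so they are not restated here.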
Model Estimation
Following the initial regression of log labour market income on the above specified
independent variables, the initial estimated regression coefficients are obtained. These results
are included in the appendix. To determine whether robust standard errors should be applied
to this model, two diagnostics are run. The first is a Breusch-Pagan test for heteroscedasticity.
The second is a plot of residuals against fitted values. Both are used to determine whether
heteroscedasticity is present in the model.
The results of the Breusch-Pagan test for heteroscedasticity, table 7 below, suggest that
heteroscedasticity is not a problem with this model as we fail to reject the null hypothesis that
the error term has a constant variance, at the 5% level. The Breusch-Pagan test, however, only
detects linear heteroscedasticity and is sensitive to violations of normality (Williams, 2015).
A residuals vs fitted values plot is thus analysed to further test for heteroscedasticity.
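The decision described above can be checked from the reported test statistic. The sketch below assumes the test has one degree of freedom, because the only variable listed for the test is the fitted values of log labour market income, and uses the identity that a chi-square variable with one degree of freedom is a squared standard normal. The function name is illustrative.

```python
import math


def chi2_one_df_p_value(statistic):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    # If Z is standard normal then Z**2 is chi-square(1), so
    # P(Z**2 > s) = P(|Z| > sqrt(s)) = erfc(sqrt(s / 2)).
    return math.erfc(math.sqrt(statistic / 2.0))


p_value = chi2_one_df_p_value(0.37)  # chi2 value reported for the Breusch-Pagan test
reject_constant_variance = p_value < 0.05
print(f"p-value = {p_value:.3f}, reject Ho at the 5% level: {reject_constant_variance}")
```

The p-value computed from the rounded statistic of 0.37 is close to the reported 0.5451; the small gap is what one would expect from rounding the statistic to two decimals.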
Table 7: Breusch-Pagan Test for Heteroskedasticity
Breusch-Pagan / Cook-Weisberg Test for Heteroskedasticity
Ho: Constant variance
Variables: Fitted Values of Log Labour Market Income
chi2 = 0.37
Prob > chi2 = 0.5451
Data Source: NIDS Wave 3 (Southern Africa Labour and Development Research Unit, 2013;
own calculations)
Observing the residuals vs fitted plot, in figure 6 below, we find that residuals for higher
fitted values tend to be slightly less dispersed than those for lower fitted values, implying
heteroscedasticity. Thus assumption five, as specified above, of constant error variance is
violated, as residuals have a lower variance for higher fitted values than for lower fitted values.
To account for the failure of this assumption and reduce the bias in the calculation of the
standard errors of our estimated coefficients, robust standard errors are applied in the
estimation of our regression model. The final results of the regression model, after the
application of robust standard errors, are listed in table 8 below.
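To show what a robust standard error changes, the sketch below compares the conventional and the heteroscedasticity-robust (White, HC0) standard error of the slope in a regression with a single regressor. This is a simplified illustration on hypothetical data whose error spread shrinks as x grows; it is not the paper's multi-variable estimation, and statistical packages typically also apply a small-sample correction that is omitted here.

```python
import math


def slope_standard_errors(x, y):
    """Return (slope, conventional SE, HC0 robust SE) for y = a + b*x + u."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    residuals = [yi - intercept - slope * xi for xi, yi in zip(x, y)]

    # Conventional SE assumes one common error variance for every observation.
    sigma2 = sum(u * u for u in residuals) / (n - 2)
    conventional = math.sqrt(sigma2 / sxx)

    # HC0 weights each squared residual by that observation's squared deviation in x.
    robust = math.sqrt(sum((xi - x_bar) ** 2 * u * u for xi, u in zip(x, residuals))) / sxx
    return slope, conventional, robust


# Hypothetical data: errors alternate in sign and shrink as x increases.
x = list(range(1, 11))
errors = [0.9, -0.8, 0.7, -0.6, 0.5, -0.4, 0.3, -0.2, 0.1, -0.05]
y = [6.0 + 0.1 * xi + e for xi, e in zip(x, errors)]

slope, conventional_se, robust_se = slope_standard_errors(x, y)
print(f"slope={slope:.4f} conventional SE={conventional_se:.4f} robust SE={robust_se:.4f}")
```

The coefficient estimate is the same under both approaches; only the standard error, and therefore the significance test, changes.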
Figure 6: Residuals vs Fitted Plot
Data Source: NIDS Wave 3 (Southern Africa Labour and Development Research Unit, 2013;
own calculations)
[Figure 6 axes: Residuals (vertical, -6 to 2) against Fitted values (horizontal, 6 to 11)]
Table 8: Final Regression Analysis Results with Robust Standard Errors Applied
(1)
VARIABLES Model 1
Constant 5.237***
(0.247)
Household Headship 0.0179
(0.0345)
Age 0.0325***
(0.0105)
Age^2 -0.000332**
(0.000131)
Tenure 0.0172***
(0.00244)
Average Hours Worked Per Week 0.00329***
(0.00119)
Education Level (Base Category: No Schooling)
Completed or Some Primary 0.0774
(0.0987)
Some Secondary or Secondary Equivalent 0.216**
(0.102)
Matric or Matric Equivalent 0.522***
(0.108)
Diploma 0.945***
(0.120)
Degree 1.317***
(0.147)
Postgraduate Degree 1.112***
(0.196)
English Writing Level Proficiency (Base Category: Not at all)
Very well 0.208***
(0.0798)
Fair 0.0749
(0.0810)
Not well -0.0227
(0.0777)
Province (Base Category: Eastern Cape)
Western Cape 0.201***
(0.0755)
Northern Cape 0.176*
(0.103)
Free State 0.174**
(0.0880)
KwaZulu-Natal 0.187***
(0.0697)
North West 0.350***
(0.0779)
Gauteng 0.365***
(0.0677)
Mpumalanga 0.457***
(0.0688)
Limpopo 0.179**
(0.0751)
Gender (Base Category: Female)
Male 0.263***
(0.0371)
Trade Union Membership (Base Category: No)
Part of a Trade Union 0.217***
(0.0387)
Primary Occupation (Base Category: Elementary Occupations)
Managers 0.640***
(0.104)
Professionals 0.423***
(0.0722)
Technicians and associate professionals 0.426***
(0.0818)
Clerical support workers 0.213**
(0.102)
Service and sales workers 0.165***
(0.0453)
Skilled agricultural, forestry and fishery workers 0.570***
(0.118)
Craft and related trades workers 0.133**
(0.0561)
Plant and machine operators, and assemblers 0.0692
(0.0598)
Economic Sector of Primary Occupation (Base Category: Private Households)
Agriculture, hunting, forestry and fishing 0.0648
(0.0661)
Mining and Quarrying 0.579***
(0.0864)
Manufacturing 0.226***
(0.0784)
Electricity, gas and water supply 0.332***
(0.103)
Construction 0.105
(0.0986)
Wholesale and Retail trade; repair etc; hotels and restaurants 0.100
(0.0831)
Transport, storage and communication 0.305***
(0.0887)
Financial intermediation, insurance, real estate and business services 0.258**
(0.115)
Community, social and personal services 0.223***
(0.0723)
Catering and accommodation 0.193**
(0.0848)
Geographic Location Type (Base Category: Rural)
Urban 0.0974***
(0.0372)
Race Group (Base Category: African)
Coloured 0.0763
(0.0692)
Asian/Indian 0.000442
(0.162)
White 0.509***
(0.0762)
Marital Status (Base Category: Not Married)
Married 0.147***
(0.0403)
Perceived Health Status (Base Category: Poor to Fair)
Good to Excellent 0.233***
(0.0809)
Observations 3,906
R-squared 0.614
Standard errors in parentheses
*** p<0.01, ** p<0.05, * p<0.1
Data Source: NIDS Wave 3 (Southern Africa Labour and Development Research Unit, 2013; own calculations)
Results and Analysis
Age and Age squared
The estimated effect of age confirms the hypothesis that age has a parabolic relationship with
labour market income, first increasing until it reaches a turning point and then decreasing.
The approximate percentage change a one year increase in age has on income can be
determined using the function: 3.25 - 0.066 × (present age). This is in line with Mincer’s
(1974:46) original findings.
The age variable was significant at the 1% level, whereas the age squared variable was only
significant at the 5% level. This lower significance could be due to the age interval used in
the analysis, which ranged from 15-64 years. The turning point of the age equation was 49.26
years; beyond this point additional years reduced labour market income. The sample excluded
many people who were beyond the turning point, which may explain the lower significance of
the age squared variable.
Education
The results for education confirm the hypothesis that, on average, greater education increases
labour market income. People with more than primary school education, on average, earn
significantly more than those without any schooling. The effect education has on income
increases with education level, except in the case of postgraduate
degree level education, which has a coefficient slightly smaller than the coefficient for
undergraduate degree level education. One potential reason for this is that an individual’s
time may be better spent attaining work experience than a postgraduate degree after
completing an undergraduate degree.
Undergraduate degree level education has the greatest impact on an individual’s labour
market income, with a coefficient of 1.317 log points, read here as approximately 131.7% more
labour market income than individuals with no schooling. Education levels of diploma and above
have a larger estimated average impact on log labour market income than any other individual
characteristic.
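The percentages quoted in this section read each coefficient as 100 × β, which is the usual approximation in a model with a logged dependent variable and is close when β is small. For larger coefficients the exact percentage difference implied by a dummy coefficient is 100 × (exp(β) - 1). The sketch below compares the two for a few Table 8 coefficients; the function and dictionary names are illustrative assumptions.

```python
import math


def exact_percent_effect(coefficient):
    """Exact % difference in income implied by a dummy coefficient in a log-income model."""
    return 100.0 * (math.exp(coefficient) - 1.0)


# Coefficients taken from Table 8.
TABLE_8_COEFFICIENTS = {
    "Degree": 1.317,
    "White": 0.509,
    "Male": 0.263,
    "Urban": 0.0974,
}

for name, beta in TABLE_8_COEFFICIENTS.items():
    approximate = 100.0 * beta
    exact = exact_percent_effect(beta)
    print(f"{name}: approximate {approximate:.1f}%, exact {exact:.1f}%")
```

The two readings are nearly identical for the urban coefficient but diverge substantially for the degree coefficient, so the large education effects are better treated as log-point differences than as literal percentages.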
Tenure
The initial hypothesis is correct: as tenure increases so does labour market income. The
tenure variable is significant at the 1% level, and contributes to explaining how labour
market income is determined. This result is consistent with Daniels and Rospabé’s (2005:11)
findings previously stated. A 1 year increase in an individual’s tenure will increase that
individual’s labour market income by approximately 1.72%. Tenure’s impact on labour
market income will be smaller than age’s positive impact on labour market income for
individuals younger than 23 years. Beyond 23 years of age, an additional year at a firm
produces a larger estimated increase in labour market income than a 1 year increase in
age.
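The comparison between age and tenure follows from the quadratic age terms. A minimal sketch, using the rounded coefficients from Table 8, computes the marginal effect of age, the turning point, and the age at which the age and tenure effects are equal. The constant and function names are illustrative.

```python
AGE_COEF = 0.0325        # Age, Table 8
AGE_SQ_COEF = -0.000332  # Age^2, Table 8
TENURE_COEF = 0.0172     # Tenure, Table 8


def age_effect_percent(age):
    """Approximate % change in income from one more year of age, at a given age."""
    # Derivative of AGE_COEF * age + AGE_SQ_COEF * age**2 with respect to age.
    return 100.0 * (AGE_COEF + 2.0 * AGE_SQ_COEF * age)


# Age at which the marginal effect of age is zero.
turning_point = -AGE_COEF / (2.0 * AGE_SQ_COEF)

# Age at which one more year of age and one more year of tenure have the same effect.
crossover_age = (TENURE_COEF - AGE_COEF) / (2.0 * AGE_SQ_COEF)

print(f"turning point: {turning_point:.2f} years")
print(f"age and tenure effects are equal at {crossover_age:.2f} years")
print(f"effect of one more year of age at 30: {age_effect_percent(30):.3f}%")
```

With the rounded coefficients the turning point comes out slightly below the 49.26 years reported in the text, presumably because the paper's figure was computed from unrounded estimates (an assumption), while the crossover age agrees with the 23 years stated above.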
Average Hours Worked per week
An increase in average hours worked per week significantly increases labour market income;
this is consistent with the initial expectation. The effect of an increase in average hours
worked is significant at the 1% level. This result is consistent with Daniels and Rospabé’s
(2005:11) findings previously stated. A 1 hour increase in average hours worked per week
increases labour market income by approximately 0.329% on average.
Union Membership
The results for union membership are in line with the previously stated hypothesis. The effect
of union membership is significant at the 1% level. Union membership increases an
individual’s labour market income on average by approximately 21.7% compared to
individuals who are not union members. This increase confirms the findings of Bosch (2006).
This large increase is a testament to the power of unions in South Africa.
Occupation
The results support the expectation that occupations with more complex job requirements pay
more. This is supported by the finding that all occupations are estimated to have a positive
impact on income when compared to the base category, elementary occupations. The
manager occupation had the largest impact on labour market income, followed by skilled
agricultural, forestry and fishery occupation, significant at the 1% level. A manager earns
approximately 64% more labour market income than an individual involved in an elementary
occupation.
Income earned by plant and machine operators and assemblers is not significantly different
from income earned by those involved in elementary occupations. The remaining occupations
have a significant effect on labour market income at either the 5% or 1% level.
Sector:
All sectors are estimated to pay more than the private household sector. Income earned by
those involved in the agriculture, hunting, forestry and fishing, construction and wholesale
and retail trade sectors is not significantly different from income earned in the private
household sector, at the 10% level. In contrast to expectations, the mining and quarrying
sector pays the highest wage premium: an individual working in the mining and quarrying
sector earns approximately 57.9% more than an individual in the private household sector.
This is followed by the electricity, gas and water supply sector, where an individual earns
approximately 33.2% more than an individual in the private household sector. The 3rd highest
paying sector is the transportation, storage and communication sector (0.305), followed by
the financial intermediation sector (0.258) and the manufacturing sector (0.226).
A possible reason for the given sector results is that the mining and quarrying sector, as well
as the electricity, gas and water supply sector, to a lesser degree, are highly unionised and
active in wage negotiations, and are both heavily regulated by government (Benhura and
Gwatidzo, 2013:10). Further research shows that although the transportation sector and
manufacturing sector have the highest sectoral determinations, the actual wages paid in these
sectors are below these regulated minima (Benhura and Gwatidzo, 2013:10).
Gender
The results are consistent with the prior stated predictions. Males are expected to earn
approximately 26.3% more labour market income, on average, than females. The gender
variable therefore significantly explains variation in log labour market income. This provides
evidence consistent with gender wage discrimination.
Race
The results show that only the income earned by those in the White race category is
significantly different from that of individuals in the African race category at the 1%
significance level. Income earned by those in the Coloured and Asian/Indian race categories is
not significantly different from income earned by those in the African race category. This is
evidence of the legacy of apartheid, where White individuals still earn significantly more than
previously disadvantaged individuals.
A White individual will earn approximately 50.9% more than an African individual. This
effect is more than twice as large as the effect that union membership has, and nearly twice
as large as the effect gender has, on log labour market income. This illustrates that there is
significant wage inequality along racial lines.
Marital Status
In line with expectations, an individual who is married will earn, on average, approximately
14.7% more than an unmarried individual. This result is significant at the 1% significance level.
Head of Household
The head of household variable does not significantly explain variation in log labour market
income at the 10% significance level. This is contrary to the findings of Miles (1997). This
could be because, since 1997, there has been a shift towards dual income households in
society, with more than one household member earning a salary (Pew Research Centre,
2015). There is thus less onus on the head of the household to provide all of the household
income.
Health Status
In line with expectations, an individual with good to excellent health earns approximately
23.3% more labour market income than an individual with poor to fair health, significant at
the 1% level. Health status appears to have a larger effect on labour market income than
marital status and union membership, which reflects the relative importance of this variable
in determining labour market income.
Geographic Area:
As expected, an individual living in an urban area earns more than an individual living in a
rural area, significant at the 1% level. The results show that an individual living in an urban
area earns approximately 9.74% more labour market income than an individual living in a
rural area. Comparatively, both health status and marital status appear to have a larger
effect on labour market income than geographic area.
Province
The province categories were all significant determinants of income relative to the Eastern
Cape, at the 10% significance level. Most provinces were significant at the 1% level. As
expected inhabitants of all provinces were estimated on average to earn greater labour market
income than inhabitants of the Eastern Cape.
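The significance levels quoted for the provinces can be reproduced from the coefficients and robust standard errors in Table 8. The sketch below uses a normal approximation to the t distribution, which is an assumption that is reasonable here given 3,906 observations; the names used are illustrative.

```python
import math

# (coefficient, robust standard error) for the province dummies in Table 8.
PROVINCES = {
    "Western Cape": (0.201, 0.0755),
    "Northern Cape": (0.176, 0.103),
    "Free State": (0.174, 0.0880),
    "KwaZulu-Natal": (0.187, 0.0697),
    "North West": (0.350, 0.0779),
    "Gauteng": (0.365, 0.0677),
    "Mpumalanga": (0.457, 0.0688),
    "Limpopo": (0.179, 0.0751),
}


def two_sided_p_value(coefficient, standard_error):
    """Two-sided p-value for Ho: coefficient = 0, using the normal approximation."""
    t_statistic = coefficient / standard_error
    return math.erfc(abs(t_statistic) / math.sqrt(2.0))


def stars(p_value):
    """Significance stars using the thresholds in the note to Table 8."""
    if p_value < 0.01:
        return "***"
    if p_value < 0.05:
        return "**"
    if p_value < 0.1:
        return "*"
    return ""


province_stars = {
    name: stars(two_sided_p_value(b, se)) for name, (b, se) in PROVINCES.items()
}
for name, marker in province_stars.items():
    print(f"{name}: {marker}")
```

Under this approximation five of the eight provinces are significant at the 1% level, which is consistent with the statement that most provinces were significant at that level.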
The results are inconsistent with the prior stated assumptions regarding which province
would have the largest effect on log labour market income. Although Gauteng had a large
coefficient, Mpumalanga had the largest coefficient at 0.457. This means that an individual
living in Mpumalanga earns approximately 45.7% more labour market income than an
individual living in the Eastern Cape. Further research needs to be conducted to explain why
Mpumalanga had the greatest effect on log labour market income.
Gauteng had the second largest effect on log labour market income, which is consistent with
the prior stated reasoning. An individual in Gauteng earns approximately 36.5% more labour
market income than an individual living in the Eastern Cape.
Being from the Western Cape, on the other hand, had a smaller effect on labour market
income than being from the North West province. A potential reason why being from the
North West province has a larger effect is the higher concentration of mines in
the North West relative to the Western Cape (Statistics South Africa, 2015:62). Mines in
South Africa have stronger unions relative to other sectors, which impacts wage negotiations
(Gernetzky, 2015:1).
English Writing Level
An individual with a very good English writing level earns approximately 20.8% more
labour market income than an individual with no English writing ability, significant at the 1%
level. This is consistent with the prior stated expectations. Good English writing ability
enhances an individual’s earnings potential.
Policy Recommendations
The education variable has the largest expected impact on labour market income. If the
overall level of education in the country is improved, income levels should rise, reducing
inequality. Therefore it is essential that South Africa reduces the dropout rate within the
education system. A recommendation is that the compulsory education requirement should be
raised from grade 9 to matric. This should increase the student retention ratio in the schooling
system and result in more employees with matric qualifications. Government could also pass
a policy requiring private schools to fund or provide discounted fees for a certain number of
underprivileged children, in ratio to the size of the school. This should increase
access to better schooling for the underprivileged, and support a reduction in inequality.
Government should emphasise the importance of English at schools and ensure that teachers
are competent and qualified to teach. This could be done through better incentive
programmes, stricter screening of potential teachers and improvement of school
administration.
Occupation has a large impact on labour market income. Government should develop and
implement apprenticeship programs that train individuals and connect them with employers
in sectors with high wage premiums, providing an opportunity for people to move out of
elementary occupations and into higher paying occupations.
Minimum sectoral wages in certain sectors should be implemented, increased or
enforced, especially in the private households sector, which pays the lowest wages. The
minimum sector wage in the transportation sector is not effectively enforced and thus is
ignored by employers (Benhura and Gwatidzo, 2013:10). The minimum sector wage for the
agriculture, hunting, forestry, and fishing sector should be increased, and then strictly
regulated. Because employees in the farming industry are sparsely distributed, transportation
costs make it difficult for unions to be active (Benhura and Gwatidzo, 2013:10).
Government could reinforce the Broad-Based Black Economic Empowerment Act to reduce
income inequality along racial lines.
Government should encourage healthy living programs and improve the state of national
healthcare to counteract the impact of poor health on wages. This could include interventions
like tax incentives for healthy food and healthy choice and a full review of the administration
of national healthcare services.
Regression Diagnostics
Various tests and observations of the data and sampling methods can be made to determine
whether the assumptions under which the above regression model was estimated hold or not.
Assumption 1: Linear in Parameters
As noted above, for the estimated regression coefficients to be unbiased the underlying
population model should be linear in parameters. The relationship between the natural log of
labour market income and the independent variables outlined in this paper’s model must be
linear. Considering the complex mechanism by which a person’s wage is determined, it is
highly unlikely that wage in reality will be exactly linearly related to some independent
variable.
One means of further testing this assumption is observing a scatter plot of the numerical
independent variables against the dependent variable. Observing the leftmost column of
figure 7 of the natural log of labour market income versus the numerical independent
variables, we do not observe a clear linear relationship between log labour market income
and the numerical independent variables. This is indicated by the vast range in the
independent variables observed across different levels of log labour market income. There is,
however, an upward tendency in each independent variable, indicating some linearity between
the two, though this is not clear.
The above noted high unlikelihood and the observed scatter plot of log labour market income
vs numerical independent variables lead us to believe that log labour market income is not
exactly linear in parameters in the population. This means the above estimated regression
coefficients are likely to be biased.
Figure 7: Graph Matrix of Log Labour Market Income vs Numerical Independent
Variables
Data Source: NIDS Wave 3 (Southern Africa Labour and Development Research Unit, 2013;
own calculations)
[Figure 7 panels: Log Labour Market Income; Best age - years; agesq; tenure; lnhoursworked]
Assumption 2: Random Sampling
The NIDS wave 3 dataset is a cross sectional dataset used as a sample representing the South
African population, thus in order for the estimated regression coefficients to be unbiased each
member of the South African population would have to have an equal probability of being
included in the sample. The NIDS wave 3 dataset is the third cross-section of data collected
from households that were selected before the initial cross-sectional survey (NIDS wave 1)
was conducted (De Villiers et al., 2013).
Selection was done at the household level for the NIDS database. Extensive effort was put
into the initial random selection of households for the first wave of NIDS data, thus it is
reasonable to assume from a household selection perspective that households had an equal
probability of initial inclusion in the NIDS wave 1 dataset (De Villiers, Leibbrandt and
Woolard, 2009). As NIDS is a panel of data, there is an inherent selection bias because the
random selection is not conducted again for the NIDS wave 3 instalment. Households that
have moved or split up are, however, still tracked, which slightly mitigates this effect
(De Villiers et al., 2013).
Even though the households that were approached for the survey were randomly selected, there
may be inherent biases created by the response rates of survey participants, which depend on
their attitudes to being surveyed; for example, there tends to be a higher attrition rate for
White and Asian population groups (De Villiers et al., 2013). This would create a bias and thus
a non-random sample, as the likelihood of these population groups being included in the dataset
would be greatly reduced. This is a violation of the random sampling assumption. This problem
is partially mitigated through the use of sample weighting in our regression output.
Assumption 3: No Perfect Collinearity
The third assumption in the classical linear regression model relates to the relationship
between independent variables. For the estimated regression coefficients to be unbiased there
would have to be no perfect collinearity between independent variables. Perfect collinearity
occurs when one independent variable can be expressed as an exact linear function of one or
many other independent variables. When running our regression, no variables were identified as
being perfectly collinear by Stata; there is thus no perfect collinearity between independent
variables in the regression model. The third classical linear regression model assumption
thus holds.
Despite the fact that there is no perfect collinearity between the independent variables, high
levels of multicollinearity - high correlation - between independent variables could also be
problematic, as this would result in unnecessarily inflated standard errors of the estimated
regression coefficients. To test for high multicollinearity, the variance inflation factors for
each of the independent variables used in the model were calculated. High variance inflation
factors could be a cause for concern, as they could result in regression results being reported
as insignificant when they are in fact significant. This problem is largely mitigated by the
large sample size of 3906 observations.
The variance inflation factors of the independent variables used in the specified regression
model are listed in table 9 below. The majority of the variance inflation factors seem to be
very low, below 3, indicating a low linear correlation between independent variables. There
are, however, some variables with variance inflation factors that stand out. Variables with
relatively higher variance inflation factors include age, age squared, all the variables in the
education category (particularly some secondary or secondary equivalent and matric or matric
equivalent) and all variables in the English writing ability category (particularly very well
and fair).
It is logical that age and age squared would be highly linearly related to each other,
as age squared is derived from age. The high linear correlation between these
variables, however, does not affect our interpretation, as they are significant in the model
despite their high variance inflation factors. This also seems to hold for the education
category variables and English writing proficiency variables, as the variables with the highest
linear correlations still come out to be significant in the model.
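A variance inflation factor is defined as 1 / (1 - R²), where R² comes from regressing one independent variable on all the others. The sketch below illustrates why age and age squared produce a high value, using the special case of only two regressors, in which R² is the squared correlation between them. The sample of one person at each age from 15 to 64 is hypothetical, so the result is not expected to match the paper's reported figures.

```python
def correlation(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def vif_two_regressors(x, y):
    """VIF = 1 / (1 - R^2); with one other regressor, R^2 is the squared correlation."""
    r = correlation(x, y)
    return 1.0 / (1.0 - r * r)


# Hypothetical sample: one person at each age in the paper's 15-64 range.
ages = list(range(15, 65))
ages_squared = [a * a for a in ages]
vif_age = vif_two_regressors(ages, ages_squared)

# R^2 implied by the VIF reported for Age in Table 9 (57.27).
implied_r_squared = 1.0 - 1.0 / 57.27

print(f"VIF for age against age squared (hypothetical sample): {vif_age:.1f}")
print(f"R^2 implied by the reported VIF of 57.27: {implied_r_squared:.4f}")
```

The 1/VIF column in Table 9 is therefore 1 - R², the share of each variable's variation that is not explained by the other independent variables.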
Variable  VIF  1/VIF
Household Headship  1.15  0.866461
Age  57.27  0.01746
Age^2  57.47  0.017402
Tenure  1.92  0.52215
Average Hours Worked Per Week  1.09  0.913501
Education Category
Completed or Some Primary  8.43  0.11864
Some Secondary or Secondary Equivalent  20.25  0.049383
Matric or Matric Equivalent  20.44  0.048928
Diploma  9.5  0.105314
Degree  3.85  0.259421
Postgraduate Degree  4.15  0.241221
English Writing Ability
Very well  10.05  0.099489
Fair  7.1  0.140917
Not well  3.47  0.288062
Province
Western Cape  3.26  0.306755
Northern Cape  1.44  0.693214
Free State  1.97  0.507873
KwaZulu-Natal  2.77  0.36145
North West  2.21  0.452445
Gauteng  4.7  0.212903
Mpumalanga  2.17  0.461312
Limpopo  2.18  0.459004
Gender: Male  1.37  0.727353
Occupation
Managers  1.38  0.725352
Professionals  2.25  0.443468
Technicians and associate professionals  1.28  0.778701
Clerical support workers  1.41  0.706957
Service and sales workers  1.77  0.56543
Skilled agricultural, forestry and fishery workers  1.03  0.974026
Craft and related trades workers  1.48  0.675968
Plant and machine operators, and assemblers  1.68  0.593825
Sector
Agriculture, hunting, forestry and fishing  1.86  0.537143
Mining and Quarrying  2.11  0.474574
Manufacturing  2.07  0.48383
Electricity, gas and water supply  1.36  0.735153
Construction  1.81  0.553698
Wholesale and Retail trade; repair etc; hotels and restaurants  1.82  0.548861
Transport, storage and communication  2.02  0.496063
Financial intermediation, insurance, real estate and business services  1.8  0.556207
Community, social and personal services  3.76  0.26607
Catering and accommodation  1.49  0.671778
Geographical Area: Urban  1.67  0.59914
Race Group
Coloured  1.76  0.568912
Asian/Indian  1.01  0.989399
White  1.45  0.68931
Marital Status: Married  1.36  0.735863
Health Status: Good-Excellent  1.07  0.930983
Union Membership: Yes  1.4  0.714639
Table 9: Variance Inflation Factors of Independent Variables
Data Source: NIDS Wave 3 (Southern Africa Labour and Development Research Unit, 2013;
own calculations)
Assumption 4: Zero Conditional Mean
The zero conditional mean assumption holds if the expected value of the error term, given any
values of the independent variables, is equal to zero. This assumption will fail if the model is
misspecified and a variable excluded from the model is both significant in explaining
variation in log labour market income and correlated with an included independent variable.
There are multiple variables that have not been accounted for in our model that could have an
effect on log labour market income. These include inherent mental ability, parents' education,
overall work experience, attitude towards work and leadership ability. Inherent mental ability
and education category are likely to be correlated, as those with greater mental ability tend to
attain more education. This is likely to cause a positive bias in the education coefficients, as
education and mental ability are probably positively related to each other and mental ability is
probably positively related to labour market income. Attitude towards work is likely to be
positively related to both average hours worked per week and the natural log of labour market
income. This will probably result in the coefficient on average hours worked being positively biased.
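The direction of this bias can be illustrated with a small simulation. The sketch below is not based on the NIDS data: the variable names, the true return to education (0.10) and the effect of ability (0.30) are all hypothetical values chosen for illustration. It regresses log income on education alone, omitting ability, and compares the estimated slope with the textbook omitted-variable formula (true coefficient plus the ability effect times the slope from regressing ability on education).

```python
import random


def ols_slope(xs, ys):
    """Slope from a simple regression of ys on xs (with an intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


rng = random.Random(0)  # fixed seed so the sketch is reproducible
n = 5000
TRUE_RETURN = 0.10      # assumed true effect of a year of education on log income
ABILITY_EFFECT = 0.30   # assumed effect of unobserved ability on log income

ability = [rng.gauss(0, 1) for _ in range(n)]
# Education rises with ability, so the two are positively correlated.
education = [10 + 2 * a + rng.gauss(0, 2) for a in ability]
log_income = [
    5 + TRUE_RETURN * e + ABILITY_EFFECT * a + rng.gauss(0, 0.5)
    for e, a in zip(education, ability)
]

# Regression that omits ability: the slope absorbs part of the ability effect.
short_slope = ols_slope(education, log_income)
# Textbook bias term: ability effect times the slope of ability on education.
predicted_bias = ABILITY_EFFECT * ols_slope(education, ability)

print(f"true return to education:  {TRUE_RETURN:.3f}")
print(f"estimate omitting ability: {short_slope:.3f}")
print(f"true return + bias term:   {TRUE_RETURN + predicted_bias:.3f}")
```

Because both the ability effect and the correlation between ability and education are positive in this setup, the estimate that omits ability overstates the return to education, which is the pattern argued for above.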
Performing a Ramsey RESET test for omitted variables (table 10 below), we can confirm the
above reasoning that variables have been omitted and that the model is thus misspecified. The null
hypothesis of this test, that the model has no omitted variables, is rejected at the 5% level
(Prob > F = 0.0024). It can be concluded that there are significant omitted variables that are
related to the independent variables and that the zero conditional mean assumption does not hold.
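For readers unfamiliar with the mechanics of the RESET test, the sketch below shows one common form of it: fit the model, add the squared, cubed and fourth powers of the fitted values as extra regressors, and test them jointly with an F statistic. The three numerator degrees of freedom in the reported F(3, 3854) are consistent with three added powers, and 3854 equals the 3 906 observations less the 49 coefficients in Table IR and the 3 added terms. The data, function names and the small linear solver here are hypothetical and written only for illustration; the test flags functional-form misspecification in general and does not identify which variables are missing.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def ols(rows, y):
    """Return (fitted values, residual sum of squares) using the normal equations."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    return fitted, rss


def reset_f(rows, y, powers=(2, 3, 4)):
    """RESET F statistic: joint test of added powers of the fitted values."""
    fitted, rss_restricted = ols(rows, y)
    augmented = [r + [f ** p for p in powers] for r, f in zip(rows, fitted)]
    _, rss_unrestricted = ols(augmented, y)
    q = len(powers)
    df_resid = len(y) - len(augmented[0])
    f_stat = ((rss_restricted - rss_unrestricted) / q) / (rss_unrestricted / df_resid)
    return f_stat, q, df_resid


rng = random.Random(42)
n = 500
x = [rng.uniform(-1, 1) for _ in range(n)]
rows = [[1.0, xi] for xi in x]  # intercept and one regressor

# Case 1: the linear model is correct. Case 2: a squared term is left out.
y_linear = [0.5 + xi + rng.gauss(0, 0.2) for xi in x]
y_curved = [0.5 + xi + 0.8 * xi ** 2 + rng.gauss(0, 0.2) for xi in x]

f_linear, q, df = reset_f(rows, y_linear)
f_curved, _, _ = reset_f(rows, y_curved)
print(f"correctly specified: F({q}, {df}) = {f_linear:.2f}")
print(f"squared term omitted: F({q}, {df}) = {f_curved:.2f}")
```

The F statistic is compared with the F(q, df) distribution; a large value, as in the second case, leads to rejection of the null hypothesis of correct specification.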
Table 10: Output of the Ramsey RESET test for omitted variables
Data Source: NIDS Wave 3 (Southern Africa Labour and Development Research Unit, 2013;
own calculations)
Assumption 5: Homoskedasticity
As discussed above in the Regression Model section, this model exhibits heteroskedasticity.
This has been accounted for through the application of robust standard errors.
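To make the idea of robust standard errors concrete, the sketch below computes both the classical and a heteroskedasticity-robust (White) standard error for the slope of a simple regression. The data are simulated and hypothetical, and the HC1 small-sample scaling of n / (n - 2) is an assumption made here for illustration; the paper does not state which variant was applied.

```python
import math
import random


def slope_with_standard_errors(xs, ys):
    """Simple-regression slope with classical and HC1 robust standard errors."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    dx = [x - mx for x in xs]
    sxx = sum(d * d for d in dx)
    slope = sum(d * (y - my) for d, y in zip(dx, ys)) / sxx
    intercept = my - slope * mx
    resid = [y - intercept - slope * x for x, y in zip(xs, ys)]

    # Classical: one common error variance for every observation.
    sigma2 = sum(e * e for e in resid) / (n - 2)
    se_classical = math.sqrt(sigma2 / sxx)

    # Robust (White, HC1): each observation contributes its own squared residual.
    meat = sum((d * e) ** 2 for d, e in zip(dx, resid))
    se_robust = math.sqrt(n / (n - 2) * meat) / sxx
    return slope, se_classical, se_robust


rng = random.Random(7)
n = 2000
regressor = [rng.uniform(0, 10) for _ in range(n)]  # hypothetical regressor
# The error spread grows with the regressor, so the errors are heteroskedastic.
outcome = [1.0 + 0.5 * v + rng.gauss(0, 0.05 * v * v) for v in regressor]

slope, se_classical, se_robust = slope_with_standard_errors(regressor, outcome)
print(f"slope = {slope:.3f}")
print(f"classical s.e. = {se_classical:.4f}, robust s.e. = {se_robust:.4f}")
```

The coefficient estimate is the same under both approaches; only the standard error, and therefore the t statistic, changes.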
Assumption 6: Normality of Residuals
The normality of residuals assumption is used to obtain the exact sampling distribution of t
and F statistics such that hypothesis testing can be carried out on the OLS regression model
and coefficients.
Observing the kernel density plot of the OLS residuals versus the normal distribution in
figure 8 below, it can be seen that the residuals are not normally distributed. This is
further confirmed by the Shapiro-Wilk test for normality in table 11 below, whose null
hypothesis of normality is rejected at the 1% level.
The statistical tests performed to determine the significance of the regression coefficients
above should not, however, be abandoned entirely. Given the large sample of 3 906
observations, we can appeal to the central limit theorem: the OLS estimators satisfy
asymptotic normality and are thus approximately normally distributed.
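The appeal to asymptotic normality can be illustrated by simulation. The sketch below is hypothetical and unrelated to the NIDS data: it draws strongly left-skewed, mean-zero errors, re-estimates a simple regression slope many times, and compares the skewness of the errors with the skewness of the resulting slope estimates.

```python
import random


def ols_slope(xs, ys):
    """Slope from a simple regression of ys on xs (with an intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


def skewness(values):
    """Moment-based sample skewness (zero for a symmetric distribution)."""
    n = len(values)
    m = sum(values) / n
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


rng = random.Random(1)
n_obs, n_reps = 200, 1000
xs = [rng.uniform(0, 10) for _ in range(n_obs)]  # regressor held fixed across replications

all_errors, slopes = [], []
for _ in range(n_reps):
    # Left-skewed, mean-zero errors: one minus an exponential draw.
    errors = [1.0 - rng.expovariate(1.0) for _ in range(n_obs)]
    ys = [2.0 + 0.5 * x + e for x, e in zip(xs, errors)]
    slopes.append(ols_slope(xs, ys))
    all_errors.extend(errors)

error_skew = skewness(all_errors)
slope_skew = skewness(slopes)
print(f"skewness of the errors:          {error_skew:.2f}")
print(f"skewness of the slope estimates: {slope_skew:.2f}")
```

Even though the individual errors are far from normal, the slope estimates are close to symmetric, which is the sense in which t and F tests remain approximately valid in large samples.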
Ramsey RESET test using powers of the fitted values of log labour market income
H0: model has no omitted variables
F(3, 3854) = 4.81
Prob > F = 0.0024
Figure 8: Kernel Density Plot of the OLS Residuals versus the Normal Distribution
Data Source: NIDS Wave 3 (Southern Africa Labour and Development Research Unit, 2013;
own calculations)
Table 11: Shapiro-Wilk test for normality on residuals
Data Source: NIDS Wave 3 (Southern Africa Labour and Development Research Unit, 2013;
own calculations)
[Figure 8 plot: Kernel Density Plot for Residuals. Kernel density estimate of the residuals overlaid with a normal density; horizontal axis: Residuals, vertical axis: Density; kernel = epanechnikov, bandwidth = 0.0915]
Shapiro-Wilk W test for normal data
H0: Data is normally distributed
Variable     Observations   W         V         z        Prob>z
Residuals    3,906          0.94613   117.035   12.398   0
Conclusion:
The determinants of labour market income were estimated using an ordinary least squares
multiple linear regression model. The results showed that the majority of the independent
variables included were significant. Prior research provided reasoning as to why variables
included were significant. The education variable had the largest impact on log labour market
income, with an undergraduate degree raising log income by 1.317 (approximately
131.7 log points), on average, compared to no education. Sector, province, occupation and race
also had large impacts on log labour market income. It was found that white individuals earn
significantly more than individuals of any other race group, illustrating that racial inequality
is still present in South Africa.
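Because the dependent variable is the natural log of labour market income, a coefficient b on a dummy variable translates into an exact percentage change of 100 * (exp(b) - 1), holding the other regressors fixed; multiplying b by 100 is a close approximation only when b is small. A minimal sketch using the Degree coefficient (1.317) and, for contrast, the Urban coefficient (0.0974), both taken from Table IR (the function name is hypothetical):

```python
import math


def log_points_to_percent(beta):
    """Exact percentage change in the level of y implied by a dummy coefficient in a log-linear model."""
    return 100.0 * (math.exp(beta) - 1.0)


degree_coefficient = 1.317   # Degree, relative to no schooling (Table IR)
urban_coefficient = 0.0974   # Urban, relative to rural (Table IR)

degree_exact = log_points_to_percent(degree_coefficient)
urban_exact = log_points_to_percent(urban_coefficient)

print(f"Degree: 100*b = {100 * degree_coefficient:.1f}, exact = {degree_exact:.1f}%")
print(f"Urban:  100*b = {100 * urban_coefficient:.1f}, exact = {urban_exact:.1f}%")
```

For a coefficient as large as the one on Degree, the exact figure is roughly 273%, well above 100 times the coefficient, whereas for the Urban coefficient the two numbers are close.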
To reduce income inequality in South Africa, a number of recommendations were provided.
Predominantly, it was advised that interventions be implemented in the education system. If
government could increase the compulsory education requirement from grade 9 to matric, it
should increase student retention and result in more matric qualifications in the workplace.
Expanding on this, private schooling should not only be available to the elite. If it were
mandatory for private schools to fund a certain number of underprivileged children through
the education system, this would support a reduction in racial inequality. Furthermore,
emphasis should be placed on the need to learn English. Government could assist by
implementing stricter screening processes for English teachers, as well as investing in
incentive programs and streamlining school administration. The government could also establish
apprenticeship programs to help individuals enter sectors with higher wage premiums.
It was found that wages varied across sectors, with workers in the private household sector
earning the least. Further research confirmed that certain sectoral minimum wages were
ineffective, with actual wages paid being lower. Government should implement and increase
sectoral minimum wages in the private household sector and the agriculture, hunting, forestry
and fishing sector, and should reinforce existing sectoral minimum wages.
In testing the regression model assumptions, it was found that assumption 1 (linearity in
parameters) and assumption 4 (zero conditional mean) were violated. The results of the study
should thus be interpreted with caution, as there is likely to be bias in the regression
coefficients. The homoskedasticity and normality of residuals assumptions were also found to
be violated; this was accounted for through the application of robust standard errors and the
use of a large sample.
Appendix:
Table IR: Initial regression output
VARIABLES                                                                 Model 1 (1)     (Std. error)
Constant                                                                  5.237***        (0.247)
Household Headship                                                        0.0179          (0.0213)
Age                                                                       0.0325***       (0.00732)
Age^2                                                                     -0.000332***    (9.30e-05)
Tenure                                                                    0.0172***       (0.00155)
Average Hours Worked Per Week                                             0.00329***      (0.000670)
Education Level (Base Category: No Schooling)
  Completed or Some Primary                                               0.0774          (0.0859)
  Some Secondary or Secondary Equivalent                                  0.216**         (0.0900)
  Matric or Matric Equivalent                                             0.522***        (0.0923)
  Diploma                                                                 0.945***        (0.0974)
  Degree                                                                  1.317***        (0.110)
  Postgraduate Degree                                                     1.112***        (0.109)
English Writing Level Proficiency (Base Category: Not at all)
  Very well                                                               0.208***        (0.0660)
  Fair                                                                    0.0749          (0.0654)
  Not well                                                                -0.0227         (0.0663)
Province (Base Category: Eastern Cape)
  Western Cape                                                            0.201***        (0.0509)
  Northern Cape                                                           0.176**         (0.0749)
  Free State                                                              0.174***        (0.0550)
  KwaZulu-Natal                                                           0.187***        (0.0488)
  North West                                                              0.350***        (0.0568)
  Gauteng                                                                 0.365***        (0.0431)
  Mpumalanga                                                              0.457***        (0.0527)
  Limpopo                                                                 0.179***        (0.0542)
Gender (Base Category: Female)
  Male                                                                    0.263***        (0.0227)
Trade Union Membership (Base Category: No)
  Part of a Trade Union                                                   0.217***        (0.0243)
Primary Occupation (Base Category: Elementary Occupations)
  Managers                                                                0.640***        (0.0511)
  Professionals                                                           0.423***        (0.0417)
  Technicians and associate professionals                                 0.426***        (0.0500)
  Clerical support workers                                                0.213***        (0.0476)
  Service and sales workers                                               0.165***        (0.0322)
  Skilled agricultural, forestry and fishery workers                      0.570***        (0.207)
  Craft and related trades workers                                        0.133***        (0.0406)
  Plant and machine operators, and assemblers                             0.0692*         (0.0380)
Economic Sector of Primary Occupation (Base Category: Private Households)
  Agriculture, hunting, forestry and fishing                              0.0648          (0.0503)
  Mining and Quarrying                                                    0.579***        (0.0538)
  Manufacturing                                                           0.226***        (0.0469)
  Electricity, gas and water supply                                       0.332***        (0.0699)
  Construction                                                            0.105*          (0.0557)
  Wholesale and Retail trade; repair etc; hotels and restaurants          0.100**         (0.0506)
  Transport, storage and communication                                    0.305***        (0.0527)
  Financial intermediation, insurance, real estate and business services  0.258***        (0.0538)
  Community, social and personal services                                 0.223***        (0.0397)
  Catering and accommodation                                              0.193***        (0.0603)
Geographic Location Type (Base Category: Rural)
  Urban                                                                   0.0974***       (0.0292)
Race Group (Base Category: African)
  Coloured                                                                0.0763*         (0.0437)
  Asian/Indian                                                            0.000442        (0.583)
  White                                                                   0.509***        (0.0385)
Marital Status (Base Category: Not Married)
  Married                                                                 0.147***        (0.0235)
Perceived Health Status (Base Category: Poor to Fair)
  Good to Excellent                                                       0.233***        (0.0400)
Observations                                                              3,906
R-squared                                                                 0.614
Standard errors in parentheses
*** p<0.01, ** p<0.05, * p<0.1
Data Source: NIDS Wave 3 (Southern Africa Labour and Development Research Unit, 2013; own calculations)
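The positive coefficient on age and the negative coefficient on age squared in Table IR imply an inverted-U age profile. As a small worked example, assuming the quadratic form log income = ... + b1 * age + b2 * age^2 and holding the other regressors fixed, the age at which predicted log income peaks is -b1 / (2 * b2). The function names below are hypothetical, and the result is only as precise as the rounded coefficients reported in the table.

```python
def quadratic_turning_point(linear, squared):
    """Value of x at which linear * x + squared * x**2 reaches its turning point."""
    return -linear / (2.0 * squared)


def marginal_effect(linear, squared, x):
    """Derivative of linear * x + squared * x**2 with respect to x."""
    return linear + 2.0 * squared * x


AGE_COEF = 0.0325          # coefficient on Age (Table IR)
AGE_SQ_COEF = -0.000332    # coefficient on Age^2 (Table IR)

peak_age = quadratic_turning_point(AGE_COEF, AGE_SQ_COEF)
effect_at_30 = marginal_effect(AGE_COEF, AGE_SQ_COEF, 30)

print(f"predicted log income peaks at about age {peak_age:.1f}")
print(f"change in log income per extra year at age 30: {effect_at_30:.4f}")
```

With these coefficients the peak falls just under 49 years, after which an additional year of age is associated with lower predicted log income.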
Reference List:
Aliber, M., Simbi, T. (2000): Agricultural Employment Crisis in South Africa. Trade and
Industrial Policy Secretariat (TIPS). Department of Trade and Industry. 2
Benhura, M., Gwatidzo, T. (2013): Mining Sector Wages in South Africa. Labour Market
Intelligence Partnership (LMIP). 9-10
Bosch, A. (2006): Determinants of public and private-sector wages in South Africa. Research
Department. South African Reserve Bank. 9-24
Botha, F. (2010): The impact of educational attainment on household poverty in South
Africa. Department of Economics and Economic History. Rhodes: University of Rhodes.
142-143
CEDEFOP. (2011): Labour-market polarisation and elementary occupations in Europe.
Luxembourg: Publications Office of the European Union. 10
Daniels, R., Rospabé, S. (2005): Estimating an Earnings Function from Coarsened Data by
an Interval Censored Regression Procedure. Development Policy Research Unit. Cape
Town: University of Cape Town. 6-12
Department of Labour. (2015): Commission for Employment Equity Annual Report 2014 to
2015. Department of Labour. vii
De Villiers, L., Leibbrandt, M., Woolard, I. (2009): Methodology: Report on NIDS Wave 1.
Cape Town: Southern Africa Labour and Development Research Unit
De Villiers, L., Brown, M., Woolard, I., Daniels, R.C., Leibbrandt, M. (2013): National
Income Dynamics Study Wave 3 User Manual. Cape Town: Southern Africa Labour and
Development Research Unit
Gernetzky, K. (2015): Mining industry, unions and government agree on plan to stem job
losses. Business Day, available at:
http://www.bdlive.co.za/business/mining/2015/08/31/mining-industry-unions-and-government-agree-on-plan-to-stem-job-losses
[2015, October 01]
International Labour Organisation. (2015): International Standard Classification of
Occupations. Available: http://www.ilo.org/public/english/bureau/stat/isco/isco88/9.htm
[2015, October 02]
Leibbrandt, M., Woolard, I. (1999): Household Incomes, Poverty and Inequality in a
Multivariate Framework. Development Policy Research Unit, University of Cape Town. 6-12
Miles, D. (1997): A household level study of the determinants of income and consumption. The
Economic Journal. 107(1):23-24
Mincer, J. (1974): Schooling, Experience, and Earnings. New York National Bureau of
Economic Research working paper no. 9362. New York: NBER
Murphy, K. (1990): Empirical Age-Earnings Profiles. Chicago: The University of Chicago
Press. 202-203
Pew Research Centre. (2015): The Rise of Dual Income Households [Online]. Available:
http://www.pewresearch.org/fact-tank/2014/06/12/5-facts-about-todays-fathers/ft_dual-income-households-1960-2012-2/
[2015, October 02]
South African Revenue Services. (2014): Tax Statistics – Highlights. South African Revenue
Services and National Treasury. 9
Southern Africa Labour and Development Research Unit. National Income Dynamics Study
2012, Wave 3 [dataset]. Version 1.2. Cape Town: Southern Africa Labour and Development
Research Unit [producer], 2013. Cape Town: DataFirst [distributor], 2013
Statistics South Africa. (2012): Income and Expenditure of Households. Report; no P0100.
Johannesburg: Statistics South Africa. 11
Statistics South Africa. (2013): Gross Domestic Product: Regional and Annual estimates.
Report; no P0441. Johannesburg: Statistics South Africa. 46
Statistics South Africa. (2015): Mid-year population estimate. Report; no P0302. Pretoria:
Statistics South Africa. 2
Statistics South Africa. (2015): Quarterly Labour Force Survey - Quarter 1. Report; no
P0211. Pretoria: Statistics South Africa. 62
Van den Berg, S., Louw, M. (2003): Changing patterns of South African income distribution:
Towards time series estimates of distribution and poverty. Paper to the Conference of the
Economic Society of South Africa, 17-19 September 2003. Stellenbosch: University of
Stellenbosch. 19-23
Williams, R. (2015): Heteroskedasticity. Available: https://www3.nd.edu/~rwilliam/stats2/l25
[2015, October 01]
World Bank. (2015): GINI Index World Bank Estimate. The World Bank Development
Research Group. Available at: http://data.worldbank.org/indicator/SI.POV.GINI [2015,
October 02]

Trang Hoang_Research Paper
 
Education and Earnings Inequality in Cameroon
Education and Earnings Inequality in CameroonEducation and Earnings Inequality in Cameroon
Education and Earnings Inequality in Cameroon
 
UNU WIDER Conf Daidone 1
UNU WIDER Conf Daidone 1UNU WIDER Conf Daidone 1
UNU WIDER Conf Daidone 1
 

Chay Stockdale & Kevin Rodrigues_Determinants and Distribution of the South African Labour Market Income, Evidence from the South African National Income Dynamic Study

  • 1. Determinants and Distribution of the South African Labour Market Income, Evidence from the South African National Income Dynamic Study
Kevin Rodrigues (RDRKEV001) & Chay Stockdale (STCCHA002)
University of Cape Town
Abstract:
Inequality is a major problem in South Africa, with the South African Gini coefficient estimated to be 0.65 in 2014 (World Bank, 2015). This paper investigates the distribution and determinants of labour market income, so that policy recommendations can place emphasis on the significant determinants in order to reduce income inequality. The paper estimates the effect of 16 independent variables, 12 of which are categorical, on labour market income in an ordinary least squares multiple linear regression model. Wave 3 of the National Income Dynamics Study is used as a sample of the South African population. A number of variables were found to have a significant impact on labour market income, with education level having the largest individual impact. Other significant variables included age, tenure, occupation, sector, province, geographic area, race, gender, union membership, average hours, health status, marital status and English writing ability. It is recommended that policy interventions be focused on improving the education level, among other recommendations that are likely to improve the incomes of individuals living in South Africa.
Keywords: Income Determinants; Income Distribution; NIDS Wave 3, South Africa
Introduction:
Inequality is a major problem in South Africa, with South Africa’s Gini Coefficient estimated to be 0.65 (World Bank, 2015). It is important to understand the distribution and determinants
  • 2. of labour market income within the country, such that income gaps can be closed by focusing on improving wages through these determinants. This paper seeks to evaluate the determinants and distribution of individual labour market income among the Employed Economically Active South African Population (excluding the self-employed), using the National Income Dynamics Study (NIDS) Wave 3, 2012, data.
The paper begins with a literature review describing relevant past research on the topic. Following this, a description of the data used and the manipulation of the data is provided. The paper then draws on past research to justify the selection of the dependent and independent variables of the model, and provides expectations of the relationships. Descriptive statistics are provided for the variables, showing an analysis of the data and underlying population through various summary statistics, pie graphs and cross-tabulations. The regression procedure and regression results are then provided, with comments on each independent variable in comparison to prior expectations. From this, policy and intervention recommendations targeting the most significant determinants of labour market income are provided. The paper ends with a regression diagnostic, testing whether the assumptions of multiple linear regression held in the model.
Literature Review:
Human Capital Theory describes factors that determine income in the labour market. Many studies have been performed in an attempt to capture the determinants and distribution of income. Given the legacy of apartheid, most South African studies in this area focus on income inequality by race and gender. International research provides insight and offers empirical evidence supporting other determinants of income. Most models isolating the determinants of income are an expansion of Mincer’s (1974) earnings function, which premised earnings as a positive function of education and work experience.
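For reference, the Mincer earnings function mentioned above is commonly written in the following form. The notation is an assumption for illustration and is not taken from the paper: $w_i$ is earnings, $S_i$ is years of schooling and $X_i$ is years of work experience.

```latex
\ln w_i = \beta_0 + \beta_1 S_i + \beta_2 X_i + \beta_3 X_i^{2} + \varepsilon_i,
\qquad \beta_1 > 0,\quad \beta_2 > 0,\quad \beta_3 < 0
```

Schooling enters linearly and experience enters as a quadratic, so the expected signs describe earnings that rise with experience at a decreasing rate.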
Work experience is expressed as a quadratic function in Mincer’s earnings function. This was an adjustment to prior studies, which showed that earnings were a quadratic function of age (Mincer, 1974:46). By making work experience quadratic, Mincer (1974:46) was able to capture the earnings-age relationship. Education is linearly related to earnings, as Mincer (1974:47) observed that the profiles of work experience against earnings were approximately parallel for different education levels. The model has been shown to hold in numerous regression studies. A regression verifying Mincer’s model was performed on the
  • 3. South African 2005 and 2006 Income and Expenditure Survey data (Botha, 2010:143). The outcome of the regression showed that earnings were significantly and positively related to work experience and education, as theorised (Botha, 2010:143).
UK research performed by Miles (1997:23) on micro household data reconfirms education and work experience as significant determinants of income, and expands the function to demonstrate that household income varies significantly according to the household’s geographical location and the gender of the head of the household. His research reported that a female head of household earned less on average than a male head, and that a person’s earnings varied according to the city in which the person lived (Miles, 1997:23). In South Africa, it was found that a household that is within an urban area and headed by a white male is less likely than any other geographical or demographic classification to live below the poverty line (Botha, 2010:142).
Statistics based on the IES for 2010 and 2011 showed that the average annual household income for Black Africans was R69 632, compared to an average of R387 011 for the white population group (Statistics South Africa, 2012:11). The white population’s share of total income exceeds what it would be according to the group’s population size (Statistics South Africa, 2012:12). The Gini coefficient for South Africa was last measured in 2014 at 0.65, compared to Brazil’s at 0.53 (World Bank, 2015). A Gini coefficient of 1 represents perfect inequality. Considering that this income inequality is portrayed through race in South Africa (Statistics South Africa, 2012:12), race can be considered a determinant of income. Research performed by Van den Berg and Louw (2003:19) found that income is unequally distributed both within and between race groups in South Africa.
While rising black incomes have narrowed the inter-racial income gap, rising intra-racial inequality has caused an increase in the Gini coefficient (Van den Berg and Louw, 2003:19). This finding of race as a significant determinant of income has been confirmed by the South African Reserve Bank’s 2006 report on the determinants of public and private-sector wages in South Africa (Bosch, 2006:9).
Bosch (2006:22) found that an individual’s industry and sector, defined as either private or public, significantly influenced their income. While individuals generally earn higher wages in the private sector, the public sector showed a fairer distribution across all wages, with a lower maximum wage (Bosch, 2006:24). The report revealed that wages were distributed
  • 4. more fairly across gender and population group in the public sector (Bosch, 2006:24). This could be due to the public sector embodying equity policy (Bosch, 2006:24).
In the same report, Bosch found that job profession and union membership significantly affect South African earnings (Bosch, 2006:9). Those in higher skilled professions and those who were union members had greater earnings (Bosch, 2006:9). Daniels and Rospabé (2005:12) confirm these findings about job profession and union membership.
In addition to the above findings on job profession, union membership, education attainment, work experience, race, age, gender, geographical location and industry, Daniels and Rospabé (2005:11) found that marriage, household headship, nature of employment activity (formal or informal) and monthly hours worked significantly affect an individual’s wage. By performing a regression analysis on the 1999 Statistics South Africa October Household Survey data, they found that on average married people earn significantly more than those who are not married (Daniels and Rospabé, 2005:11). On average, those who are the head of a household earn significantly higher wages than those who are not (Daniels and Rospabé, 2005:11). Individuals involved in formal economic activities earn significantly higher incomes than those involved in informal activity. Finally, an individual’s monthly hours worked significantly affect their income earned (Daniels and Rospabé, 2005:11).
Data and Methods:
Data Description
This study uses Wave 3 of the National Income Dynamics Study (NIDS) to investigate the distribution and determinants of labour market income in South Africa. The study is limited to the employed, economically active population, excluding those who are self-employed.
NIDS Wave 3 Data
The NIDS survey is a face-to-face longitudinal study of individuals in South Africa and the people living in their households.
It is a combination of multiple individual and household level questionnaires covering various aspects of the individuals and households including, amongst other things, their demographics, household characteristics, income and employment information, health status and education attainment. The NIDS Wave 3 Data is the latest
  • 5. cross section of the individuals followed throughout the NIDS study, collected in 2012 (Southern Africa Labour and Development Research Unit, 2013:2).
Quality of the NIDS Wave 3 Data
The quality of the data that underpins the analysis was integral for accurate determination of the independent variables. NIDS Wave 3 provides a broad range of household and individual data. A major reason for the survey was to measure income mobility within South Africa, and thus much of the data regarding earnings was relevant to the analysis. The Wave 3 cross-sectional data consists of 8040 successful household interviews and 32 633 successful individual interviews (Southern Africa Labour and Development Research Unit, 2013:4). After preparing the data for the regression analysis in this particular study, there were 6210 observations left. The survey was designed with the intention of being nationally representative, and the data is thus relevant for defining the determinants of labour market income for South Africa.
Data Manipulation
The Wave 3 NIDS data was manipulated using the statistical package STATA 14. This study is limited to the employed, economically active population: employees only, excluding the self-employed.
Preparation and Manipulation Process of Data
1. Data Merging: All separate databases contained in the NIDS Wave 3 study were merged into one database before manipulations.
2. Removal of observations that were not successfully collected: All observations in the NIDS Wave 3 database that were not successfully collected have been deleted; only those surveys which were successfully collected are used in our analysis. This solves many problems, including removing people who are deceased and people who are members of two households, as only one survey will be
  • 6. successfully completed per person and mapped to the household they were in when the survey was completed.
3. Removing all people below 15 years of age and above 65 years of age: This study is limited to the economically active population as defined by the Department of Labour of South Africa, which excludes all people below 15 and above 65 (Department of Labour, 2015). Thus all people below the age of 15 and above the age of 65 have been removed from the sample considered.
4. Deleting all discouraged, unemployed and not economically active individuals: This study is limited to those who are employed, excluding the self-employed. Thus all observations that are identified as unemployed, discouraged or not economically active have been removed.
5. Removing Child and Proxy data: The child and proxy questionnaires used in the NIDS Wave 3 study have insufficient information relating to employment and employment-related factors to perform a consistent, comprehensive analysis of the determinants of labour market income. These observations have thus been removed from the sample.
6. Construction of a Labour Market Income variable: The labour market income variable has been constructed using an individual-level variable (w3_fwag) used by NIDS in their aggregation of household labour market income. w3_fwag is the wage from a person’s primary and secondary occupations. This variable is collated across all respondents. Where a response is a bracket response, the mid-point of the bracket has been taken. For those who indicated that they received income above the highest band, double the upper value of the highest band has been used for their income. The 4.97% of values that were missing have been imputed using regression. The labour market income variable was derived by taking w3_fwag and subtracting net secondary occupation income (em2pay) where it is greater than zero. The income values have been deflated such that they are recorded in August 2012 monetary terms.
  • 7. 7. Deleting all observations where labour market income is zero: It has been assumed that all those who have a labour market income of zero are either self-employed or not economically active, yet incorrectly recorded as being economically active. They thus do not form a part of this analysis. All observations with a labour market income of 0 have been deleted.
8. Construction of a Tenure variable: To determine tenure, the start date (in months and years) of the current occupation was subtracted from the date of the NIDS interview (in months and years). Negative values, meaning that the individual was starting employment after the interview date, were removed.
9. Cleaning and Categorizing Variables: Following this, the prospective independent variables were cleaned and categorized for use in the analysis.
10. Weighting: To ensure that the inference drawn from this study is applicable in the national context, the NIDS Wave 3 survey data was weighted using post-stratified sample weights.
Following the cleaning and manipulation of the NIDS Wave 3 survey data, the original sample size was reduced to 6210 observations.
Dependent Variable
The labour market income variable constructed reflects the monthly income from the primary occupation of an individual. The distribution of the labour market income variable was heavily skewed and non-normal. This skewness in income is consistent with Daniels and Rospabé’s (2005:6) findings. The best transformation of labour market income, approximating a normal distribution, was the natural log of the variable. The dependent variable used in this analysis is the natural log of the labour market income.
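The preparation steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' STATA code: the income bands, the record field names and the helper functions are hypothetical, and the regression imputation and deflation steps are left out.

```python
import math

# Hypothetical income bands (lower, upper) in rand; NOT the actual NIDS bands.
BANDS = [(0, 1000), (1000, 3000), (3000, 8000)]


def band_income(band_index):
    """Mid-point of a bracket response; double the top band's upper value above it."""
    if band_index >= len(BANDS):
        return 2 * BANDS[-1][1]
    lower, upper = BANDS[band_index]
    return (lower + upper) / 2


def months(year_month):
    """Convert a (year, month) pair to a month count."""
    year, month = year_month
    return 12 * year + month


def prepare(records):
    """Apply the age, income and tenure rules, then add log income."""
    cleaned = []
    for r in records:
        if not 15 <= r["age"] <= 65:  # step 3: keep ages 15 to 65
            continue
        wage = r["wage"] if r["wage"] is not None else band_income(r["band"])
        # Step 6: subtract secondary occupation pay only when it is positive.
        income = wage - max(r["secondary_pay"], 0.0)
        if income <= 0:  # step 7 (assumption: non-positive values are dropped too)
            continue
        tenure = months(r["interview"]) - months(r["start"])  # step 8
        if tenure < 0:
            continue
        cleaned.append({"income": income, "tenure_months": tenure,
                        "log_income": math.log(income)})
    return cleaned


records = [
    {"age": 30, "wage": 5000.0, "band": None, "secondary_pay": 1000.0,
     "start": (2010, 1), "interview": (2012, 8)},
    {"age": 14, "wage": 900.0, "band": None, "secondary_pay": 0.0,
     "start": (2012, 1), "interview": (2012, 8)},
    {"age": 40, "wage": None, "band": 1, "secondary_pay": 0.0,
     "start": (2012, 10), "interview": (2012, 8)},
    {"age": 50, "wage": None, "band": 3, "secondary_pay": 0.0,
     "start": (2000, 8), "interview": (2012, 8)},
    {"age": 25, "wage": 0.0, "band": None, "secondary_pay": 0.0,
     "start": (2011, 8), "interview": (2012, 8)},
]
sample = prepare(records)
```

Only the first and fourth records survive the filters: the others fall outside the age range, start work after the interview date, or report zero income.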
  • 8. Independent Variables
Each independent variable chosen was supported by theoretical reasoning and empirical evidence from prior research.
Age
Mincer’s (1974:46) original equation theorised age as a proxy for work experience: as age increases, so does work experience and thus income. Age-earnings profiles empirically support a significant positive relationship between age and earnings (Murphy, 1990:202).
Age squared
The age squared term captures the effect of age on an individual’s productivity. Beyond a certain age, productivity is expected to decline. The age squared term is therefore expected to have a negative relationship with income. This reasoning is similar to the reasoning Mincer (1974:50) provided for the work experience squared term, which involved the over-specialisation of an individual such that he becomes less employable. This reasoning is empirically supported by regressions on age-earnings profiles, which show that the age squared term significantly reduces income (Murphy, 1990:203).
Education
Mincer’s (1974:46) earnings function expressed a positive linear relationship between education and income. Higher education is theorised to allow an individual to perform more complicated tasks, which are rewarded with a higher income. Education’s positive impact on income is empirically supported by Daniels and Rospabé (2005). For the purposes of this study, education years have been recoded and categorized into no schooling, some/complete primary schooling, some secondary schooling or secondary equivalent, matric or equivalent, diploma, undergraduate degree and postgraduate qualification. The base category used was no schooling.
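For the age and age squared terms described above, the expected signs imply an income profile that peaks at a particular age. The derivation below is a sketch: the symbols $\beta_a$ and $\beta_{a2}$ and the numerical values are hypothetical and are not estimates from the paper.

```latex
\ln y = \dots + \beta_a\,\mathrm{age} + \beta_{a2}\,\mathrm{age}^2 + \dots
\quad\Rightarrow\quad
\frac{\partial \ln y}{\partial\,\mathrm{age}} = \beta_a + 2\beta_{a2}\,\mathrm{age} = 0
\quad\Rightarrow\quad
\mathrm{age}^{*} = -\frac{\beta_a}{2\beta_{a2}}

% Hypothetical example: \beta_a = 0.08,\ \beta_{a2} = -0.0008
\mathrm{age}^{*} = -\frac{0.08}{2(-0.0008)} = 50
```

With $\beta_a > 0$ and $\beta_{a2} < 0$, log income rises with age up to $\mathrm{age}^{*}$ and declines thereafter.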
  • 9. Tenure
The longer an employee is with a company, the more likely the employee is to become part of the core operations of the business, and hence the more valuable the employee is to the company. In return, the employee’s salary should increase. This reasoning supports the inclusion of the tenure variable in the model. Consistent with the findings of Daniels and Rospabé (2005), it is expected that as tenure increases so does labour market income.
Average Hours Worked per week
The more hours an individual works in a week, the greater his salary/wage should be for the week, if paid overtime or on an hourly basis. This simple reasoning warranted the inclusion of this variable. Consistent with the findings of Daniels and Rospabé (2005), it is expected that as average hours worked increases so does labour market income.
Union Membership
Being a union member should increase income through the wage bargaining process. Consistent with the findings of Bosch (2006:23), union membership is expected to have a significant positive impact on income. The base category in the following analysis is that an individual is not a union member.
Occupation
Occupations that require more complex skills should pay higher salaries. Bosch (2006) finds that occupations that require more skill tend to pay larger salaries. The base occupation category in this analysis is elementary occupations, which, requiring the lowest skill level, are expected to have the lowest income relative to other occupations.
Sector
It is expected that certain sectors pay premium wages. This is evident in the minimum sectoral wages; for instance, the minimum sectoral wage for agriculture is lower than that for the mining sector (Bosch, 2006:20). Bosch (2006) finds that there are sectors which pay wage premiums, resulting in greater income for the individuals employed in those sectors. The following analysis uses the private household sector as the
  • 10. base category, as it is expected that this sector pays the lowest premium wages. It is expected that the manufacturing and transportation sectors pay the highest wage premiums, as these sectors have the highest minimum sectoral wages (Benhura and Gwatidzo, 2013:9).
Gender
According to 2014 tax statistics, females earn on average a third of what males earn (South African Revenue Services, 2014:9). Bosch (2006) supports this, finding that female workers on average earn lower wages. In line with this, it is expected that females will earn less than males. The base category for the following analysis was female.
Race
Due to the legacy of apartheid, inequality in South Africa often manifests along racial lines (Statistics South Africa, 2012:12). Van den Berg and Louw (2003) find race to be a relevant factor in determining income in South Africa. The African race group is used as the base category in the analysis. It is expected that an individual belonging to the White race category will earn more labour market income than individuals in any other category, due to this group being the beneficiaries of apartheid.
Marital Status
An individual who is married is expected to have more responsibilities and commitments than a single individual, and therefore has a greater incentive to earn more income. Consistent with Daniels and Rospabé (2005), it is expected that married individuals earn more than single individuals. This variable has been categorised into married and unmarried; the base category is unmarried.
Head of Household
An individual who is the head of a household has more responsibilities than an individual who is not, and therefore has a greater incentive to earn more income. In line with Miles (1997), household headship is expected to be a significant determinant of income. The base category in this analysis is that an individual is not the head of a household.
  • 11. Health Status
The sicker an individual is, the less likely the individual is to be willing and able to work, and the less reliable they will be as an employee. This should reduce their labour market income. Health status is categorised into two categories: good to excellent, and fair to poor. The base category for this analysis is fair to poor health.
Geographic Area
In line with the 2012 Income and Expenditure of Households report by Statistics South Africa, it is expected that the population living in rural areas earns on average less than the population living in urban areas. The base category used for this variable is rural.
Province
It is expected that provinces with higher GDPs have a greater amount of economic activity occurring in them, resulting in premium wages being paid in these provinces. Daniels and Rospabé (2005) find that province has a significant impact on earnings. The base category for the following analysis is the Eastern Cape, as this province has a relatively low GDP and had the lowest income levels in 2013 (Statistics South Africa, 2013:46). It is expected that individuals living in the Gauteng and Western Cape provinces will earn the highest labour market incomes, as these two provinces have the highest GDP and income levels and are economic hubs of South Africa (Statistics South Africa, 2013:46).
English Writing Level
English is the most common language used in commercial South Africa. It is thus hypothesised that an individual with greater proficiency in English will be open to better opportunities and thus earn a higher income. The base category for this analysis is no English writing ability.
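The specification described above, with log income regressed on age, age squared and categorical variables coded as dummies against a base category, can be sketched as follows. This is an illustrative NumPy example on synthetic data, not the authors' STATA procedure; the helper names `dummies` and `weighted_ols`, the weights and all coefficient values are assumptions made for the illustration.

```python
import numpy as np


def dummies(values, base):
    """One 0/1 column per category, omitting the base category."""
    levels = sorted(set(values) - {base})
    cols = np.array([[1.0 if v == lev else 0.0 for lev in levels] for v in values])
    return levels, cols


def weighted_ols(X, y, w):
    """Weighted least squares: minimise sum_i w_i * (y_i - x_i'b)^2."""
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return beta


# Synthetic, noise-free data with hypothetical coefficients.
age = np.arange(20, 60, dtype=float)
union = ["member" if i % 3 == 0 else "non-member" for i in range(age.size)]
weights = np.array([1.0 + (i % 5) for i in range(age.size)])  # stand-in survey weights

levels, union_cols = dummies(union, base="non-member")
log_income = 7.0 + 0.08 * age - 0.0008 * age**2 + 0.3 * union_cols[:, 0]

X = np.column_stack([np.ones_like(age), age, age**2, union_cols])
beta = weighted_ols(X, log_income, weights)

# With a log dependent variable, a dummy coefficient b corresponds to an
# approximate percentage difference of 100 * (exp(b) - 1) relative to the base.
union_premium_pct = 100 * (np.exp(beta[3]) - 1)
```

Because the dependent variable is in logs, each dummy coefficient is read relative to its omitted base category, which is why the choice of base category is stated for every categorical variable.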
  • 12. Descriptive Statistics:
Distribution of Labour Market Income
Table 1: Summary Statistics for Labour Market Income
Statistic | Value
Observations | 6,181
Population Weight of Observations | 12 973 193
Mean | 5514.503
Standard Deviation | 7421.482
Variance | 55 100 000
Skewness | 4.348522
Kurtosis | 34.85887
Percentile | Value
1% | 200
5% | 501.0225
10% | 812.4787
25% | 1585.44
50% | 3100
75% | 6351.852
90% | 11890.8
95% | 18092.31
99% | 36329.06
Figure 1: Histogram showing the distribution of Labour Market Income in South Africa
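Weighted moments of the kind reported in Table 1 can be computed along the following lines. This is a sketch using population-style (uncorrected) moment formulas and a non-excess kurtosis, for which a normal distribution gives 3; a statistical package may apply slightly different corrections, and the function name and sample values are illustrative assumptions.

```python
import numpy as np


def weighted_moments(x, w):
    """Weighted mean, standard deviation, skewness and (non-excess) kurtosis."""
    x = np.asarray(x, dtype=float)
    p = np.asarray(w, dtype=float)
    p = p / p.sum()  # normalise the weights to sum to one
    mean = np.sum(p * x)
    var = np.sum(p * (x - mean) ** 2)
    sd = np.sqrt(var)
    skew = np.sum(p * (x - mean) ** 3) / sd**3
    kurt = np.sum(p * (x - mean) ** 4) / var**2
    return mean, sd, skew, kurt


# A small right-skewed sample: most values are low, one is very high.
incomes = [1000.0, 1000.0, 1000.0, 10000.0]
mean, sd, skew, kurt = weighted_moments(incomes, w=[1, 1, 1, 1])
```

A positive skewness, as in this toy sample, indicates a long right tail, which is the pattern the income data displays.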
  • 13. The above histogram, Figure 1, gives a picture of the distribution of labour market income in the South African population. Only people with incomes below R100 000 have been plotted on the histogram to enhance readability. It is clear that the South African population has a highly unequal distribution of income. This is consistent with the World Bank’s (2015) estimated Gini coefficient of 0.65 stated in the literature review. The distribution of income in South Africa is heavily skewed to the right. The monthly labour market income of the South African population is highly concentrated around the peak in the distribution, with the majority of people earning less than R10 000 per month. This is further informed by the summary statistics in Table 1 above. The skewness of 4.35 deviates widely from the skewness of 0 of the normal distribution, confirming the right skew apparent in the graph. The kurtosis of 34.86 is an indication of how clustered the observations of South African labour market income are around the median value of R3 100. 50% of all employed economically active South Africans (excluding the self-employed) earn between R1 585 and R6 352. To further illustrate how clustered monthly income values are below the value of R10 000, 90% of all employed economically active South Africans earn below R11 891. Table 2, which follows, illustrates the summary statistics of the explanatory variables of labour market income.
Summary Statistics of Age, Education, Race, Gender, Occupation and Sector
Variable | Observations | Weight | Mean | Standard Deviation | Minimum | Maximum | Mode
Age | 6,181 | 12973193.00 | 37.06728 | 10.4499 | 15 | 64 | 32
Education Years | 6,173 | 12964349.40 | 10.46122 | 3.225948 | 0 | 18 | 12
Race | 6,181 | 12973193.00 | - | - | - | - | African
Gender | 6,181 | 12973193.00 | - | - | - | - | Male
Occupation | 5,384 | 11575098.90 | - | - | - | - | Elementary Occupations
Sector | 5,283 | 11272926.30 | - | - | - | - | Community, Social and Personal Services
Summary Statistics of population
Table 2: Summary Statistics of Age, Education Years, Race, Gender, Occupation and Sector
Age
The minimum and maximum age for the population is 15 years and 64 years respectively. The sample was cleaned to remove all people below 15 and over 64, as the formal definition for the economically active population includes people from 15 to 64 (Department of Labour,
  • 14. 2015). The average age for the employed economically active population is 37.07 years, with a standard deviation of 10.45 years. The most common age within the population is 32 years.
Education years
The minimum number of education years is 0, meaning a person has no schooling. The maximum is 18 years, meaning that a person has either a Master’s degree or a Doctorate. The average number of years of education in the population is 10.46 years, with a standard deviation of 3.23 years. The average of 10.46 years is approximately equivalent to a Grade 10 school qualification. The most common number of years of education within the population is 12 years, which is equivalent to a matric. After determining the summary statistics in years, the education variable was categorised into: no schooling, some/complete primary schooling, some secondary schooling or secondary equivalent, matric or equivalent, diploma, undergraduate degree and postgraduate qualification.
Race
Race is defined as: 1. African, 2. Coloured, 3. Asian/Indian and 4. White. The mean, minimum and maximum do not provide meaningful information, as they are based on the number demarcating the race outcome, and have been removed from the summary statistics. The mode illustrates that the largest race group in the overall population is African.
Table 2.1: Summary Statistics for Race
Race Group | Frequency | Percent | Cumulative
1. African | 4,499.9077 | 72.8 | 72.8
2. Coloured | 680.609865 | 11.01 | 83.81
3. Asian/Indian | 193.057955 | 3.12 | 86.94
4. White | 807.424465 | 13.06 | 100
Total | 6,181 | 100 |
Table 2.1 illustrates the group percentages according to race. The table shows that 72.8% of the population is African, which is broadly comparable to Statistics South Africa’s 2015 mid-year results, which had the African race group at 80.5% of the population. The Asian/Indian group is the smallest race group, representing only 3.12% of the population.
  • 15. Gender
The mode shows that the population consists of more males than females.
Table 2.2: Summary Statistics for Gender
Table 2.2 illustrates that 57.63% of people in the considered population are male while 42.37% of people in the population are female.
Occupation
The mode shows that the most common occupation within the population is elementary occupations. Elementary occupations are defined as paid jobs that require the completion of simple, routine tasks, which could require the use of hand tools or physical exertion (International Labour Organisation, 2015).
Table 2.3: Summary Statistics for Occupation
Table 2.3 illustrates that elementary occupations form the largest occupation group in our sample, at 25.89%. This is in line with the mean and mode of education years, as elementary occupations require limited or no education. A European paper suggests that as educational attainment increases, the majority of the population should be in occupations that require higher education; furthermore, as technology advances, fewer workers will be required in
  • 16. elementary occupations, but those workers would require higher education (CEDEFOP, 2012:10). The skilled agriculture, forestry and fishery occupation accounts for the smallest percentage at 0.28%, which is consistent with the relatively slow mechanisation in this sector in South Africa (Aliber & Simbi, 2000:2).
Sector
The mode shows that the most common sector to work in, within the population, is the community, social and personal services sector.
Table 2.4: Tabulation of Sector variable
Sector | Frequency | Percent | Cumulative
0. Private households | 504.885079 | 9.56 | 9.56
1. Agriculture, hunting, forestry and fishery | 379.136345 | 7.18 | 16.73
2. Mining and Quarrying | 390.284465 | 7.39 | 24.12
3. Manufacturing | 515.741469 | 9.76 | 33.88
4. Electricity, gas and water supply | 162.799577 | 3.08 | 36.96
5. Construction | 323.03887 | 6.11 | 43.08
6. Wholesale and Retail trade; repair etc | 388.817069 | 7.36 | 50.44
7. Transport, storage and communication | 405.4604513 | 7.67 | 58.11
8. Financial intermediation, insurance, etc | 338.959187 | 6.42 | 64.53
9. Community, social and personal services | 1,676.1041 | 31.73 | 96.26
10. Catering and accommodation | 197.77335 | 3.74 | 100.00
Total | 5,283 | 100 |
Table 2.4 illustrates that 31.73% of the population works in the community, social and personal services sector; this represents the largest concentration of the population according to sector. The South African Reserve Bank shows that 8.8% of the total wage bill for the private sector was attributable to the community, social and personal services sector (Bosch, 2006:19). The smallest sector is the electricity, gas and water supply sector, which employs 3.08% of the population.
  • 17. Analysis of the income by covariates
Geographical area type
Table 3: Summary Statistics of Labour Market Income by Geographic Area Type
Area | Mean | Standard Deviation | Frequency | Population Proportion | Observations
Traditional | 3416.5542 | 4134.0827 | 2 424 755.00 | 18.69% | 1,524
Urban | 6277.1992 | 8142.2836 | 9 670 880.00 | 74.55% | 3,911
Farms | 2906.2061 | 3337.3939 | 877 558.00 | 6.76% | 746
Total | 5514.503 | 7421.4825 | 12 973 193.00 | 100.00% | 6,181
Table 3 illustrates that the average monthly income earned by the portion of the population living in traditional areas, urban areas and farm areas is R3416.55, R6277.20 and R2906.21 respectively. People living in urban areas earn on average the highest monthly income. People living in farm areas earn on average the lowest monthly income. The difference between the average monthly income in urban and farm areas is R3370.99, which is approximately equal to the average monthly income earned in traditional areas. Urban areas have the greatest dispersion in labour market income, with a standard deviation of R8142.28. The standard deviations for traditional and farm areas are R4134.08 and R3337.39 respectively.
Figure 2: Pie Graph of Monthly Labour Market Income across Geographic Areas
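The income shares shown in a pie graph of this kind follow from the weighted population and mean income of each area type in Table 3. The short sketch below recomputes them; the variable and function names are illustrative, and the inputs are the figures reported in Table 3.

```python
# Weighted population and mean monthly income (rand) by area type, from Table 3.
table3 = {
    "Traditional": (2_424_755, 3416.5542),
    "Urban": (9_670_880, 6277.1992),
    "Farms": (877_558, 2906.2061),
}


def income_shares(table):
    """Percentage of total monthly income earned in each area type."""
    totals = {area: pop * mean for area, (pop, mean) in table.items()}
    grand_total = sum(totals.values())
    return {area: 100 * total / grand_total for area, total in totals.items()}


shares = income_shares(table3)
for area, share in shares.items():
    print(f"{area}: {share:.2f}% of total monthly income")
```

Up to rounding, this should reproduce shares of about 84.86% for urban areas, 11.58% for traditional areas and 3.56% for farm areas, against population proportions of 74.55%, 18.69% and 6.76%.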
  • 18. 18 Figure 2 above illustrates that 84.86% of the total monthly income is earned in urban areas, even though only 74.55% of the population lives in urban areas. Figure 2 further illustrates that 11.58% of total monthly income is earned in traditional areas and 3.56% is earned in farm areas, even though 18.69% of the population lives in traditional areas and the remaining 6.76% lives in farm areas. This means that people who live in urban areas earn, on average, more than those in traditional and farm areas, which is illustrated by the means in table 3. Following this, traditional and farm areas were combined to form rural areas, due to their similarity.
Race Groups
Table 4: Summary Statistics of Labour Market Income by Race Group
Race | Mean | Standard Deviation | Frequency | Population Proportion | Observations
African | 4115.7063 | 5176.7689 | 9444778 | 72.80% | 4,518
Coloured | 4560.2441 | 5628.6571 | 1428520 | 11.01% | 1,314
Asian/Indian | 11183.206 | 12048.051 | 405206 | 3.12% | 88
White | 12759.201 | 11829.756 | 1694689 | 13.06% | 261
Total | 5514.503 | 7421.4825 | 12 973 193.00 | 100.00% | 6,181
Table 4 illustrates the average monthly income earned by each race group. The African, Coloured, Asian/Indian and White population groups earn monthly averages of R4115.71, R4560.24, R11183.21 and R12759.20 respectively. The White group has the highest average, closely followed by the Asian/Indian group. The African group has the lowest average. The Coloured average is R444.53 above the African average. The difference between the White and African averages is R8643.50, which is approximately equal to the sum of the average monthly incomes earned by the African and Coloured groups. The Asian/Indian group has the highest standard deviation, R12048.05, meaning this group has the greatest dispersion in monthly income.
The White group also has a high standard deviation of R11829.76, compared to the African and Coloured groups, which have standard deviations of R5176.77 and R5628.66 respectively. The African and Coloured groups therefore have a relatively low dispersion in monthly income.
  • 19. 19 Figure 3: Pie Graph of Monthly Labour Market Income across Race Group
Figure 3 above illustrates that 54.34% of the total monthly income is earned by the African population group, even though 72.80% of the population is African. Figure 3 further illustrates that 30.22% of the total monthly income is earned by the White population, 9.11% by the Coloured population and 6.33% by the Asian/Indian population, even though 13.06% of the population is White, 11.01% is Coloured and the remaining 3.12% is Asian/Indian. This means that the White and Asian/Indian population groups earn more of the total monthly income than would be expected from the size of each group relative to the total population.
Gender Groups
Table 5: Summary Statistics of Labour Market Income by Gender
  • 20. 20 Table 5 above provides evidence of discrepancies in income between gender groups. The mean income for males is R6076.22, while that for females is R4 750.46. Males on average earn more than females and have a higher dispersion in their earnings, with a standard deviation of R8233.59 compared to R6065.69 for females.
Figure 4: Pie Graph of Monthly Labour Market Income across Genders
As illustrated in Figure 4 above, males earn 63.5% of all monthly labour market income in South Africa, while females earn 36.5%. This is despite the fact that the considered population consists of 57.63% males and 42.36% females. Males earn disproportionately more income than females.
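The income shares shown in the pie graphs can be cross-checked against the summary tables. The following is a minimal Python sketch, assuming that a group's share of total income equals its population proportion multiplied by its mean income, divided by the sum of those products over all groups. The function name `income_shares` is hypothetical, and the inputs are the rounded proportions and means quoted in the text.

```python
def income_shares(groups):
    """Return each group's percentage share of total income.

    `groups` maps a group name to (population proportion, mean income).
    """
    # Income per head of total population contributed by each group.
    totals = {name: prop * mean for name, (prop, mean) in groups.items()}
    grand_total = sum(totals.values())
    return {name: 100 * value / grand_total for name, value in totals.items()}


# Rounded figures quoted in the text for geographic area and gender.
area = {
    "Traditional": (0.1869, 3416.55),
    "Urban": (0.7455, 6277.20),
    "Farms": (0.0676, 2906.21),
}
gender = {
    "Male": (0.5763, 6076.22),
    "Female": (0.4236, 4750.46),
}

area_shares = income_shares(area)
gender_shares = income_shares(gender)

for name, share in {**area_shares, **gender_shares}.items():
    print(f"{name}: {share:.2f}% of total monthly income")
```

Under this assumption the urban share comes out at roughly 84.9%, in line with the share reported for Figure 2.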
  • 21. 21 Age Cohorts
Table 6: Summary Statistics of Labour Market Income by Age Cohort
Table 6 above provides summary statistics of labour market income by age cohort. From this table we can deduce a relationship between monthly labour market income and age. The lowest mean wage (R2 146.24) is earned by those in the 15-20 year age category, while the highest mean wage (R7 939.87) is earned by those in the 50-55 year age category. A person's income tends to increase as they get older until it peaks within the age band 50 to 55. Income seems to rise faster than it falls: the increases up to the peak income age cohort are between R800 and just over R1000, whereas the decreases after the peak are below R500. Age cohorts above 40 earn a higher mean monthly income than the population mean. Those below 40 tend to earn a lower monthly income than the population mean. The dispersion of income also seems to increase as people get older, except for the first two categories, 15-20 and 20-25, which deviate from the trend. Dispersion of income tends to increase from the 20-25 year age cohort, with a standard deviation of R2 665.15, to a maximum of R12 095.09 in the 55-60 age cohort, and then decreases, as does the mean wage. Age cohorts between 40 and 60 tend to have wages that are more dispersed than the wages of the entire population.
  • 22. 22 Figure 5: Pie Chart of Monthly Labour Market Income by Age Cohort
Figure 5 above indicates that although individuals in the age cohort 50-55 years earn the highest mean income, this group earns only 6.519% (the 4th lowest proportion) of all monthly labour market income earned by the population. This is because the cohort makes up only 8.39% of the population under consideration, which is small relative to other age cohorts that make up as much as 17.28% of that population. The cohorts that earn the top three proportions of all monthly labour market income, in descending order, are age cohort 35-40, age cohort 40-45 and age cohort 30-35. These groups form the middle income bracket in terms of their means but make up much higher proportions of the population: 16.62%, 13.68% and 17.28% respectively. The cohorts that earn the bottom three proportions of all monthly labour market income, starting with the lowest, are age cohort 15-20, age cohort 60+ (60-64) and age cohort 20-25. These groups make up 1.45%, 1.77% and 9.6% of the total population under consideration respectively. The cohorts 15-20 and 20-25 also earn the lowest mean incomes.
  • 23. 23 Regression Method
The effects of the above mentioned independent variables on log labour market income have been estimated through an ordinary least squares (OLS) multiple linear regression. This method estimates the effects of the independent variables on log labour market income by fitting a line of best fit that minimises the sum of the squared differences between the log labour market income values predicted by the estimated line and the actual log labour market income values. Categorical independent variables in the regression model affect the intercept of the estimated regression line with reference to some base category, and numerical independent variables affect the slope of the estimated regression line.
For OLS multiple linear regression parameters to be unbiased, four assumptions must hold. Unbiased means that, on average, if we take infinitely many samples from the same population and estimate the effect using this method, the estimated effect of the independent variable on the dependent variable in the sample is equal to the effect of the independent variable on the dependent variable in the population. First, the population model must be linear in parameters: in the underlying population model the dependent variable must be linearly related to the independent variables and some error term. Second, a random sample from the underlying population must be collected: each member of the population must have the same probability of being included in the sample. Third, there must be no perfect collinearity between independent variables: none of the independent variables may be constant and there must be no exact linear relationship among the independent variables. Fourth, the expected value of the error term (u), that is, the factors other than the independent variables that affect the dependent variable, must be equal to zero given any values of the independent variables.
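As a compact restatement of the estimation method and of what the four assumptions deliver, the model can be written as below. The notation is assumed here for illustration and is not taken from the paper: $y_i$ is the labour market income of individual $i$, $x_{ij}$ the $j$-th independent variable, $k$ the number of regressors and $n$ the number of observations.

```latex
\begin{align}
\log(y_i) &= \beta_0 + \sum_{j=1}^{k} \beta_j x_{ij} + u_i \\
(\hat\beta_0, \dots, \hat\beta_k) &= \arg\min_{b_0, \dots, b_k}
  \sum_{i=1}^{n} \Bigl( \log(y_i) - b_0 - \sum_{j=1}^{k} b_j x_{ij} \Bigr)^2 \\
\mathrm{E}(u \mid x_1, \dots, x_k) &= 0
\end{align}
```

The first line is the population model that is linear in parameters (assumption one), the second is the OLS objective, and the third is the zero conditional mean condition (assumption four). Together with random sampling and no perfect collinearity, these imply $\mathrm{E}(\hat\beta_j) = \beta_j$ for each $j$.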
In order for the estimated parameters to be considered the best linear unbiased estimators - estimators with the least variance when compared to all other linear unbiased estimators, of the underlying population parameters - a fifth assumption must hold. This assumption is the assumption of homoscedasticity, where the error term has the same variance given any value of the independent variables. For the estimated sample parameters to be considered minimum variance unbiased estimators - estimators with the smallest variance of all unbiased estimators - a sixth assumption must
  • 24. 24 hold. This is the assumption that the population error is independent of the explanatory variables and normally distributed with zero mean and variance σ². Assumptions 1-6 above are the classical linear model assumptions. This paper will proceed assuming that the classical linear model assumptions hold for the regression output and analysis section. These assumptions will then be tested.
Model Estimation
The initial regression of log labour market income on the independent variables specified above yields the initial estimated regression coefficients. These results are included in the appendix. To determine whether robust standard errors should be applied to this model, two diagnostics for heteroscedasticity are run: first, a Breusch-Pagan test and, second, a residuals vs fitted values plot. The results of the Breusch-Pagan test, table 7 below, suggest that heteroscedasticity is not a problem in this model, as we fail to reject the null hypothesis that the error term has a constant variance at the 5% level. The Breusch-Pagan test, however, only detects linear heteroscedasticity and is sensitive to violations of normality (Williams, 2015). A residuals vs fitted values plot is thus analysed to further test for heteroscedasticity.
Table 7: Breusch-Pagan Test for Heteroskedasticity
Breusch-Pagan / Cook-Weisberg Test for Heteroskedasticity
Ho: Constant variance
Variables: Fitted Values of Log Labour Market Income
chi2 = 0.37
Prob > chi2 = 0.5451
Data Source: NIDS Wave 3 (Southern Africa Labour and Development Research Unit, 2013; own calculations)
Observing the residuals vs fitted plot, in figure 6 below, we find that residuals for higher fitted values tend to be slightly less dispersed than those for lower fitted values, implying heteroscedasticity. Thus assumption five, as specified above, of constant error variance is
  • 25. 25 violated, as residuals have a lower variance for higher values of x than for lower values of x. To account for the failure of this assumption and reduce the bias in the calculation of the standard errors of our estimated coefficients, robust standard errors are applied in the estimation of our regression model. The final results of the regression model, after the application of robust standard errors, are listed in table 8 below.
Figure 6: Residuals vs Fitted Plot (vertical axis: Residuals; horizontal axis: Fitted values)
Data Source: NIDS Wave 3 (Southern Africa Labour and Development Research Unit, 2013; own calculations)
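The logic of testing residual variance against fitted values can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the paper's Stata procedure: it uses Koenker's studentized form of the Breusch-Pagan statistic (n times the R-squared from regressing squared residuals on the fitted values), which may differ from the variant behind table 7, and the residuals and fitted values are made-up toy numbers. The function names are hypothetical.

```python
import math


def chi2_sf_1df(stat):
    """Upper-tail probability of a chi-squared variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))


def koenker_bp(residuals, fitted):
    """LM statistic n * R^2 from regressing squared residuals on fitted values."""
    n = len(residuals)
    squared = [e * e for e in residuals]
    mean_f = sum(fitted) / n
    mean_s = sum(squared) / n
    sxy = sum((f - mean_f) * (s - mean_s) for f, s in zip(fitted, squared))
    sxx = sum((f - mean_f) ** 2 for f in fitted)
    syy = sum((s - mean_s) ** 2 for s in squared)
    r_squared = (sxy * sxy) / (sxx * syy)  # R^2 of a one-regressor regression
    lm = n * r_squared
    return lm, chi2_sf_1df(lm)


# Toy data (hypothetical): the spread of residuals shrinks as fitted values rise.
fitted = [6.0, 7.0, 8.0, 9.0, 10.0, 11.0]
residuals = [2.0, -1.5, 1.0, -0.8, 0.4, -0.2]

lm, p_value = koenker_bp(residuals, fitted)
print(f"LM = {lm:.3f}, p = {p_value:.4f}")

# p-value implied by a chi2(1) statistic of 0.37, the rounded value in table 7.
p_table7 = chi2_sf_1df(0.37)
print(f"p for chi2 = 0.37: {p_table7:.3f}")
```

With one degree of freedom, a statistic of 0.37 gives a p-value of about 0.54, which agrees with the reported Prob > chi2 up to rounding of the statistic.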
  • 26. 26 Table 8: Final Regression Analysis Results with Robust Standard Errors Applied
VARIABLES | (1) Model 1
Constant | 5.237*** (0.247)
Household Headship | 0.0179 (0.0345)
Age | 0.0325*** (0.0105)
Age^2 | -0.000332** (0.000131)
Tenure | 0.0172*** (0.00244)
Average Hours Worked Per Week | 0.00329*** (0.00119)
Education Level (Base Category: No Schooling)
Completed or Some Primary | 0.0774 (0.0987)
Some Secondary or Secondary Equivalent | 0.216** (0.102)
Matric or Matric Equivalent | 0.522*** (0.108)
Diploma | 0.945*** (0.120)
Degree | 1.317*** (0.147)
Postgraduate Degree | 1.112*** (0.196)
English Writing Level Proficiency (Base Category: Not at all)
Very well | 0.208*** (0.0798)
Fair | 0.0749 (0.0810)
Not well | -0.0227 (0.0777)
Province (Base Category: Eastern Cape)
Western Cape | 0.201*** (0.0755)
Northern Cape | 0.176* (0.103)
Free State | 0.174** (0.0880)
KwaZulu-Natal | 0.187*** (0.0697)
North West | 0.350*** (0.0779)
Gauteng | 0.365*** (0.0677)
Mpumalanga | 0.457*** (0.0688)
Limpopo | 0.179** (0.0751)
  • 27. 27 VARIABLES | (1) Model 1
Gender (Base Category: Female)
Male | 0.263*** (0.0371)
Trade Union Membership (Base Category: No)
Part of a Trade Union | 0.217*** (0.0387)
Primary Occupation (Base Category: Elementary Occupations)
Managers | 0.640*** (0.104)
Professionals | 0.423*** (0.0722)
Technicians and associate professionals | 0.426*** (0.0818)
Clerical support workers | 0.213** (0.102)
Service and sales workers | 0.165*** (0.0453)
Skilled agricultural, forestry and fishery workers | 0.570*** (0.118)
Craft and related trades workers | 0.133** (0.0561)
Plant and machine operators, and assemblers | 0.0692 (0.0598)
Economic Sector of Primary Occupation (Base Category: Private Households)
Agriculture, hunting, forestry and fishing | 0.0648 (0.0661)
Mining and Quarrying | 0.579*** (0.0864)
Manufacturing | 0.226*** (0.0784)
Electricity, gas and water supply | 0.332*** (0.103)
Construction | 0.105 (0.0986)
Wholesale and Retail trade; repair etc; hotels and restaurants | 0.100 (0.0831)
Transport, storage and communication | 0.305*** (0.0887)
Financial intermediation, insurance, real estate and business services | 0.258** (0.115)
Community, social and personal services | 0.223*** (0.0723)
Catering and accommodation | 0.193** (0.0848)
Geographic Location Type (Base Category: Rural)
Urban | 0.0974*** (0.0372)
Race Group (Base Category: African)
Coloured | 0.0763 (0.0692)
Asian/Indian | 0.000442 (0.162)
White | 0.509*** (0.0762)
  • 28. 28 VARIABLES | (1) Model 1
Marital Status (Base Category: Not Married)
Married | 0.147*** (0.0403)
Perceived Health Status (Base Category: Poor to Fair)
Good to Excellent | 0.233*** (0.0809)
Observations | 3,906
R-squared | 0.614
Standard errors in parentheses
*** p<0.01, ** p<0.05, * p<0.1
Data Source: NIDS Wave 3 (Southern Africa Labour and Development Research Unit, 2013; own calculations)
Results and Analysis
Age and Age squared
The estimated effect of age confirms the hypothesis that age has a parabolic relationship with labour market income, first increasing until it reaches a turning point and then decreasing. The approximate percentage change in income from a one year increase in age can be determined using the function: 3.25 – 0.066(present age). This is in line with Mincer's (1974:46) original findings. The age variable was significant at the 1% level, whereas the age squared variable was only significant at the 5% level. This lower significance could be due to the age interval used in the analysis, which ranged from 15-64 years. The turning point of the age equation was 49.26 years; beyond this point additional years reduced labour market income. The sample excluded many people who were beyond the turning point. This may explain the lower significance of the age squared variable.
Education
The results for education confirm the hypothesis: on average, greater education increases labour market income. People with more than primary school education earn, on average, significantly more than those without any schooling. The effect of education on income increases with education level, except in the case of postgraduate degree level education, which has a coefficient slightly smaller than the coefficient for undergraduate degree level education. One potential reason for this is that an individual's
  • 29. 29 time may be better spent attaining work experience than a postgraduate degree after completing an undergraduate degree. Undergraduate degree level education has the greatest impact on an individual's labour market income, with an estimated labour market income of approximately 131.7% more than individuals with no schooling. Education levels of diploma and above have a larger estimated average impact on log labour market income than any other individual characteristic.
Tenure
The initial hypothesis is supported: as tenure increases so does labour market income. The tenure variable is significant at the 1% level and contributes to explaining how labour market income is determined. This result is consistent with Daniels and Rospabé's (2005:11) findings previously stated. A 1 year increase in an individual's tenure will increase that individual's labour market income by approximately 1.72%. Tenure's impact on labour market income will be smaller than age's positive impact on labour market income for individuals younger than 23 years. Beyond 23 years of age, an additional year at a firm has a larger estimated positive effect on labour market income than a 1 year increase in age.
Average Hours Worked per week
An increase in average hours worked per week significantly increases labour market income, which is consistent with the initial expectation. The effect of an increase in average hours worked is significant at the 1% level. This result is consistent with Daniels and Rospabé's (2005:11) findings previously stated. A 1 hour increase in average hours worked per week increases labour market income by approximately 0.329% on average.
Union Membership
The results for union membership are in line with the previously stated hypothesis. The effect of union membership is significant at the 1% level.
Union membership increases an individual's labour market income on average by approximately 21.7% compared to individuals that are not union members. This increase confirms the findings of Bosch (2006). This large increase is a testament to the power of unions in South Africa.
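The age, tenure and percentage interpretations above can be reproduced from the rounded coefficients in table 8. The Python sketch below is illustrative only and its function names are hypothetical. It makes two assumptions worth flagging: the marginal effect of age is the derivative of the quadratic age term, and the exact percentage effect of a coefficient b in a log-income model is 100(e^b − 1), of which the 100·b reading used in the text is a first-order approximation that loosens as coefficients grow.

```python
import math

# Rounded coefficients as reported in table 8.
B_AGE = 0.0325
B_AGE_SQ = -0.000332
B_TENURE = 0.0172


def age_effect_pct(age):
    """Approximate % change in income from one more year of age."""
    return 100 * (B_AGE + 2 * B_AGE_SQ * age)


def age_turning_point():
    """Age at which the estimated age effect changes sign."""
    return -B_AGE / (2 * B_AGE_SQ)


def exact_pct(coef):
    """Exact % difference implied by a coefficient in a log-income model."""
    return 100 * (math.exp(coef) - 1)


turning = age_turning_point()
print(f"Turning point from rounded coefficients: {turning:.2f} years")
print(f"Age effect at 23: {age_effect_pct(23):.3f}% vs tenure {100 * B_TENURE:.2f}%")
print(f"Age effect at 24: {age_effect_pct(24):.3f}%")

for label, coef in [("Union membership", 0.217), ("Degree", 1.317)]:
    print(f"{label}: approx {100 * coef:.1f}%, exact {exact_pct(coef):.1f}%")
```

The rounded coefficients give a turning point slightly below the 49.26 years reported in the text, presumably because the paper worked from unrounded estimates. The age effect falls below the tenure effect between ages 23 and 24, which matches the comparison made in the tenure discussion.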
  • 30. 30 Occupation
The results support the expectation that occupations with more complex job requirements pay more. This is supported by the finding that all occupations are estimated to have a positive impact on income when compared to the base category, elementary occupations. The manager occupation had the largest impact on labour market income, followed by the skilled agricultural, forestry and fishery occupation, both significant at the 1% level. A manager earns approximately 64% more labour market income than an individual in an elementary occupation. Income earned by plant and machine operators and assemblers is not significantly different from income earned by those in elementary occupations. The remaining occupations have a significant effect on labour market income at either the 5% or 1% level.
Sector:
All sectors are estimated to pay more than the private household sector. Income earned by those in the agriculture, hunting, forestry and fishery sector, the construction sector and the wholesale and retail trade sector is not significantly different from income in the private household sector at the 10% level. In contrast to expectations, the mining and quarrying sector pays the highest wage premium: an individual working in the mining and quarrying sector earns approximately 57.9% more than an individual in the private household sector. This is followed by the electricity, gas and water supply sector, where an individual earns approximately 33.2% more than an individual in the private household sector. Among the sectors significant at the 1% level, the 3rd and 4th highest paying are the transportation, storage and communication sector and the manufacturing sector. A possible reason for these sector results is that the mining and quarrying sector and, to a lesser degree, the electricity, gas and water supply sector are highly unionised and active in wage negotiations, and are both heavily regulated by government (Benhura and Gwatidzo, 2013:10).
Further research shows that although the transportation sector and manufacturing sector have the highest sectoral determinations, the actual wages paid in these sectors are below these regulated minima (Benhura and Gwatidzo, 2013:10).
  • 31. 31 Gender
The results are consistent with the previously stated predictions. Males are expected to earn approximately 26.3% more labour market income, on average, than females. The gender variable therefore significantly explains variation in log labour market income. This provides evidence of gender wage discrimination.
Race
The results show that only the income earned by those in the White race category is significantly different from that of individuals in the African race category at the 1% significance level. Income earned by those in the Coloured and Asian/Indian race categories is not significantly different from income earned by those in the African race category. This is evidence of the legacy of apartheid, where White individuals still earn significantly more than previously disadvantaged individuals. A White individual will earn approximately 50.9% more than an African individual. This effect is more than twice as large as the effect that union membership has, and nearly twice as large as the effect gender has, on log labour market income. This illustrates that there is significant wage inequality along racial lines.
Marital Status
In line with expectations, an individual who is married will earn, on average, approximately 14.7% more than a single individual. This result is significant at the 1% significance level.
Head of Household
The head of household variable does not significantly explain variation in log labour market income at the 10% significance level. This is contrary to the findings of Miles (1997). This could be because, since 1997, there has been a shift towards dual income households in society, with more than one household member earning a salary (Pew Research Centre, 2015). There is thus less onus on the head of the household to provide all of the household income.
  • 32. 32 Health Status
In line with expectations, an individual with good to excellent health earns approximately 23.3% more labour market income than an individual with poor to fair health, significant at the 1% level. Health status appears to have a larger effect on labour market income than marital status and union membership, which reflects the relative importance of this variable in determining labour market income.
Geographic Area:
As expected, an individual living in an urban area earns more than an individual living in a rural area, significant at the 1% level. The results show that an individual living in an urban area earns approximately 9.74% more labour market income than an individual living in a rural area. By comparison, both health status and marital status appear to have a larger effect on labour market income than geographic area.
Province
The province categories were all significant determinants of income relative to the Eastern Cape at the 10% significance level. Most provinces were significant at the 1% level. As expected, inhabitants of all provinces were estimated, on average, to earn greater labour market income than inhabitants of the Eastern Cape. The results are inconsistent with the previously stated assumptions regarding which province would have the largest effect on log labour market income. Although Gauteng had a large coefficient, Mpumalanga had the largest coefficient at 0.457. This means that an individual living in Mpumalanga earns approximately 45.7% more labour market income than an individual living in the Eastern Cape. Further research needs to be conducted to explain why Mpumalanga had the greatest effect on log labour market income. Gauteng had the second largest effect on log labour market income, which is consistent with the previously stated reasoning. An individual in Gauteng earns approximately 36.5% more labour market income than an individual living in the Eastern Cape.
Being from the Western Cape, on the other hand, had a smaller effect on labour market income than being from the North West province. A potential reason why being from the
  • 33. 33 North West province has a larger effect is the higher concentration of mines in the North West relative to the Western Cape (Statistics South Africa, 2015:62). Mines in South Africa have stronger unions relative to other sectors, which affects wage negotiations (Gernetzky, 2015:1).
English Writing Level
An individual with a very good English writing level earns approximately 20.8% more labour market income than an individual with no English writing ability, significant at the 1% level. This is consistent with the previously stated expectations. Good English writing ability enhances an individual's earnings potential.
Policy Recommendations
The education variable has the largest expected impact on labour market income. If the overall level of education in the country is improved, income levels should rise, reducing inequality. It is therefore essential that South Africa reduces the dropout rate within the education system. One recommendation is that the compulsory education requirement be raised from grade 9 to matric. This should increase the student retention ratio in the schooling system and result in more employees with matric qualifications. Government could also pass a policy requiring private schools to fund, or provide discounted fees for, a certain number of underprivileged children in proportion to the size of the school. This should increase access to better schooling for the underprivileged and support a reduction in inequality. Government should emphasise the importance of English at schools and ensure that teachers are competent and qualified to teach. This could be done through better incentive programmes, stricter screening of potential teachers and improvement of school administration. Occupation has a large impact on labour market income.
Government should develop and implement apprenticeship programmes that train individuals and connect them with employers in sectors with high wage premiums, providing an opportunity for people to move out of elementary occupations and into higher paying occupations.
  • 34. 34 Minimum sectoral wages should be implemented, increased or enforced in certain sectors, especially in the private households sector, which pays the lowest wages. The minimum sector wage in the transportation sector is not effectively enforced and is thus ignored by employers (Benhura and Gwatidzo, 2013:10). The minimum sector wage for the agriculture, hunting, forestry and fishing sector should be increased and then strictly regulated. Because employees in the farming industry are sparsely distributed, transportation costs make it difficult for unions to be active (Benhura and Gwatidzo, 2013:10). Government could reinforce the Broad-Based Black Economic Empowerment Act to reduce income inequality along racial lines. Government should encourage healthy living programmes and improve the state of national healthcare to counteract the impact of poor health on wages. This could include interventions such as tax incentives for healthy food and healthy choices, and a full review of the administration of national healthcare services.
Regression Diagnostics
Various tests and observations of the data and sampling methods can be made to determine whether the assumptions under which the above regression model was estimated hold.
Assumption 1: Linear in Parameters
As noted above, for the estimated regression coefficients to be unbiased the underlying population model should be linear in parameters. The relationship between the natural log of labour market income and the independent variables outlined in this paper's model must be linear. Considering the complex mechanism by which a person's wage is determined, it is highly unlikely that wage in reality will be exactly linearly related to some independent variable. One means of further testing this assumption is observing a scatter plot of the numerical independent variables against the dependent variable.
Observing the leftmost column of figure 7 of the natural log of labour market income versus the numerical independent variables, we do not observe a clear linear relationship between log labour market income
  • 35. 35 and the numerical independent variables. This is indicated by the vast range in the independent variables observed across different levels of log labour market income. There is, however, an upward tendency in each independent variable, indicating some linearity between the two, though this is not clear. The high unlikelihood noted above and the observed scatter plot of log labour market income vs numerical independent variables lead us to believe that log labour market income is not exactly linear in parameters in the population. This means the estimated regression coefficients above are likely to be biased.
Figure 7: Graph Matrix of Log Labour Market Income vs Numerical Independent Variables (panels: Log Labour Market Income, Best age - years, agesq, tenure, lnhoursworked)
Data Source: NIDS Wave 3 (Southern Africa Labour and Development Research Unit, 2013; own calculations)
Assumption 2: Random Sampling
The NIDS wave 3 dataset is a cross sectional dataset used as a sample representing the South African population, thus in order for the estimated regression coefficients to be unbiased each
  • 36. 36 member of the South African population would have to have an equal probability of being included in the sample. The NIDS wave 3 dataset is the third cross-section of data collected from households that were selected before the initial cross-sectional survey (NIDS wave 1) was conducted (De Villiers et al., 2013). Selection for the NIDS database was done at the household level. Extensive effort was put into the initial random selection of households for the first wave of NIDS data, so it is reasonable to assume that households had an equal probability of initial inclusion in the NIDS wave 1 dataset (De Villiers, Leibbrandt and Woolard, 2009). As NIDS is a panel, there is an inherent selection bias because the random selection is not conducted again for the NIDS wave 3 instalment. Households that have moved or split up are, however, still tracked, which slightly mitigates this effect (De Villiers et al., 2013). Even though the households approached for the survey were randomly selected, biases may arise from differences in response rates that depend on participants' attitudes to being surveyed; for example, there tends to be a higher attrition rate for White and Asian population groups (De Villiers et al., 2013). This would create a bias, and thus a non-random sample, as the likelihood of these population groups being included in the dataset would be greatly reduced. This is a violation of the random sampling assumption. This problem is partially mitigated through the use of sample weighting in our regression output.
Assumption 3: No Perfect Collinearity
The third assumption in the classical linear regression model relates to the relationship between independent variables. For the estimated regression coefficients to be unbiased there would have to be no perfect collinearity between independent variables.
Perfect collinearity occurs when one independent variable can be expressed as an exact linear function of one or more other independent variables. When running our regression, no variables were identified by Stata as being perfectly collinear; there is thus no perfect collinearity between independent variables in the regression model. The third classical linear regression model assumption thus holds. Even though there is no perfect collinearity between the independent variables, high levels of multicollinearity - high correlation - between independent variables could also be
  • 37. 37 problematic, as this would result in unnecessarily inflated standard errors of the estimated regression coefficients. To test for high multicollinearity, the variance inflation factor of each independent variable used in the model was calculated. High variance inflation factors could be a cause for concern, as they could result in regression results being reported as insignificant when they are in fact significant. This problem is largely mitigated by the large sample size of 3906 observations. The variance inflation factors of the independent variables used in the specified regression model are listed in table 9 below. The majority of the variance inflation factors are very low, below 3, indicating a low linear correlation between independent variables, but some variables have variance inflation factors that stand out. Variables with relatively high variance inflation factors include age, age squared, all the variables in the education category (particularly some secondary or secondary equivalent and matric or matric equivalent) and all variables in the English writing ability category (particularly very well and fair). It is logical that age and age squared would be highly linearly related to each other, as age squared is derived from age. The high linear correlation between these variables does not, however, affect our interpretation, as they are significant in the model despite their high variance inflation factors. The same holds for the education category variables and the English writing proficiency variables, as the variables with the highest linear correlations are still significant in the model.
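The link between a variance inflation factor and the underlying correlation can be made concrete with a short Python sketch. It relies on the standard definition VIF = 1 / (1 − R²), where R² comes from regressing one independent variable on all the others. The helper names are hypothetical, and the age example is a stylised two-variable case on the integers 15 to 64, not the paper's data, so it will not reproduce the reported factors exactly.

```python
import math


def vif_from_r2(r_squared):
    """Variance inflation factor implied by an auxiliary R-squared."""
    return 1.0 / (1.0 - r_squared)


def r2_from_vif(vif):
    """Auxiliary R-squared implied by a variance inflation factor."""
    return 1.0 - 1.0 / vif


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# R-squared implied by the reported VIF for age (57.27).
r2_age = r2_from_vif(57.27)
print(f"Implied auxiliary R-squared for age: {r2_age:.4f}")

# Stylised case: age and age squared alone over the working-age range.
ages = list(range(15, 65))
ages_squared = [a * a for a in ages]
r = pearson(ages, ages_squared)
vif_age_only = vif_from_r2(r * r)
print(f"corr(age, age^2) = {r:.4f}, two-variable VIF = {vif_age_only:.1f}")

# Standard errors are inflated by the square root of the VIF.
print(f"SE inflation at VIF 57.27: {math.sqrt(57.27):.2f}x")
```

The stylised example shows that age and age squared are almost perfectly correlated over this range by construction, which is why their factors are far higher than those of the other variables.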
Table 9: Variance Inflation Factors of Independent Variables

Variable | VIF | 1/VIF
Household Headship | 1.15 | 0.866461
Age | 57.27 | 0.01746
Age^2 | 57.47 | 0.017402
Tenure | 1.92 | 0.52215
Average Hours Worked Per Week | 1.09 | 0.913501
Education Category:
Completed or Some Primary | 8.43 | 0.11864
Some Secondary or Secondary Equivalent | 20.25 | 0.049383
Matric or Matric Equivalent | 20.44 | 0.048928
Diploma | 9.5 | 0.105314
Degree | 3.85 | 0.259421
Postgraduate Degree | 4.15 | 0.241221
English Writing Ability:
Very well | 10.05 | 0.099489
Fair | 7.1 | 0.140917
Not well | 3.47 | 0.288062
Province:
Western Cape | 3.26 | 0.306755
Northern Cape | 1.44 | 0.693214
Free State | 1.97 | 0.507873
KwaZulu-Natal | 2.77 | 0.36145
North West | 2.21 | 0.452445
Gauteng | 4.7 | 0.212903
Mpumalanga | 2.17 | 0.461312
Limpopo | 2.18 | 0.459004
Gender: Male | 1.37 | 0.727353
Health Status: Good-Excellent | 1.07 | 0.930983
Occupation:
Managers | 1.38 | 0.725352
Professionals | 2.25 | 0.443468
Technicians and associate professionals | 1.28 | 0.778701
Clerical support workers | 1.41 | 0.706957
Service and sales workers | 1.77 | 0.56543
Skilled agricultural, forestry and fishery workers | 1.03 | 0.974026
Craft and related trades workers | 1.48 | 0.675968
Plant and machine operators, and assemblers | 1.68 | 0.593825
Sector:
Agriculture, hunting, forestry and fishing | 1.86 | 0.537143
Mining and Quarrying | 2.11 | 0.474574
Manufacturing | 2.07 | 0.48383
Electricity, gas and water supply | 1.36 | 0.735153
Construction | 1.81 | 0.553698
Wholesale and Retail trade; repair etc; hotels and restaurants | 1.82 | 0.548861
Transport, storage and communication | 2.02 | 0.496063
Financial intermediation, insurance, real estate and business services | 1.8 | 0.556207
Community, social and personal services | 3.76 | 0.26607
Catering and accommodation | 1.49 | 0.671778
Geographical Area: Urban | 1.67 | 0.59914
Race Group:
Coloured | 1.76 | 0.568912
Asian/Indian | 1.01 | 0.989399
White | 1.45 | 0.68931
Marital Status: Married | 1.36 | 0.735863
Union Membership: Yes | 1.4 | 0.714639

Data Source: NIDS Wave 3 (Southern Africa Labour and Development Research Unit, 2013; own calculations)

Assumption 4: Zero Conditional Mean
The zero conditional mean assumption holds if the expected value of the error term, conditional on the independent variables, is equal to zero. This assumption fails if the model is misspecified, for example if a variable that is significant in explaining variation in log labour market income and correlated with an included independent variable is excluded from the model. Several variables that could affect log labour market income have not been accounted for in our model; these include inherent mental ability, parents' education, overall work experience, attitude towards work and leadership ability. Inherent mental ability and education category are likely to be correlated, as those with greater mental ability tend to attain more education. This is likely to bias the education coefficients upwards, as mental ability is probably positively related to both education and labour market income. Attitude towards work is likely to be positively related to both average hours worked per week and the natural log of labour market income, which will probably result in the coefficient on average hours worked being positively biased.
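The direction of the bias described above can be illustrated with a small sketch. It is not drawn from the NIDS data: the education years, ability scores and the two "true" coefficients below are hypothetical values chosen only so that ability and education are positively related. With no noise added, the slope from the short regression (ability omitted) equals the true education coefficient plus the ability coefficient times the slope from regressing ability on education.

```python
def ols_slope(x, y):
    """Slope from a simple OLS regression of y on x (with an intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


# Hypothetical data: years of education and an unobserved ability score.
educ = [8, 10, 12, 12, 14, 16, 16, 18]
ability = [0.9, 1.1, 1.0, 1.4, 1.3, 1.8, 1.6, 2.1]

# Hypothetical "true" effects on log income (assumed for illustration only).
beta_educ = 0.10
beta_ability = 0.50
log_income = [5.0 + beta_educ * e + beta_ability * a
              for e, a in zip(educ, ability)]

# Short regression: ability is omitted, so education picks up part of its effect.
short_slope = ols_slope(educ, log_income)

# Omitted variable bias = beta_ability * (slope of ability on education).
delta = ols_slope(educ, ability)
bias = beta_ability * delta

print(f"true education coefficient:      {beta_educ:.4f}")
print(f"short-regression coefficient:    {short_slope:.4f}")
print(f"implied omitted variable bias:   {bias:.4f}")
```

Because both the assumed ability effect and the education-ability slope are positive, the short-regression coefficient overstates the education effect, which is the positive bias the text anticipates.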
Performing a Ramsey RESET test for omitted variables (table 10 below) supports the above argument that variables have been omitted and that the model is thus misspecified. The null hypothesis of no omitted variables is rejected at the 5% level. It can be concluded that the model is misspecified, consistent with significant omitted variables that are related to the independent variables, and that the zero conditional mean assumption does not hold.

Table 10: Output of the Ramsey RESET test for omitted variables
Ramsey RESET test using powers of the fitted values of log labour market income
H0: model has no omitted variables
F(3, 3854) = 4.81
Prob > F = 0.0024
Data Source: NIDS Wave 3 (Southern Africa Labour and Development Research Unit, 2013; own calculations)

Assumption 5: Homoskedasticity
As discussed above in the Regression Model section, this model exhibits heteroskedasticity. This has been accounted for through the application of robust standard errors.

Assumption 6: Normality of Residuals
The normality of residuals assumption is used to obtain the exact sampling distribution of the t and F statistics, such that hypothesis testing can be carried out on the OLS regression model and its coefficients. Observing the kernel density plot of the OLS residuals against the normal distribution, in figure 8 below, it can be gathered that the residuals are not normally distributed. This is further confirmed by the Shapiro-Wilk test for normality, in table 11 below, in which the null hypothesis of normality is rejected at the 1% level. The statistical tests performed above to determine the significance of the regression coefficients should not, however, be abandoned entirely. Given the large sample of 3 906 observations we can conclude, using the central limit theorem, that the OLS estimators satisfy the asymptotic normality condition and are thus approximately normally distributed.
Figure 8: Kernel Density Plot of the OLS Residuals versus the Normal Distribution
[Figure: kernel density estimate of the residuals overlaid on a normal density. x-axis: Residuals (-6 to 2); y-axis: Density (0 to 0.8); kernel = epanechnikov, bandwidth = 0.0915]
Data Source: NIDS Wave 3 (Southern Africa Labour and Development Research Unit, 2013; own calculations)

Table 11: Shapiro-Wilk W test for normality on residuals
H0: Data is normally distributed

Variable | Observations | W | V | z | Prob>z
Residuals | 3,906 | 0.94613 | 117.035 | 12.398 | 0

Data Source: NIDS Wave 3 (Southern Africa Labour and Development Research Unit, 2013; own calculations)
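The Prob>z value in table 11 is shown as 0 because of rounding. As a rough check, and assuming the reported z statistic is compared against the upper tail of a standard normal distribution, the tail probability can be recovered from z = 12.398 with the standard library alone. The function name below is illustrative and not part of any statistics package.

```python
import math


def upper_tail_normal(z):
    """P(Z > z) for a standard normal variable Z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


z_reported = 12.398  # z statistic from table 11
p_value = upper_tail_normal(z_reported)
reject_at_1pct = p_value < 0.01

print(f"upper-tail probability: {p_value:.3e}")
print(f"reject normality at the 1% level: {reject_at_1pct}")
```

The probability is far below any conventional significance level, which agrees with the rejection at the 1% level stated in the text.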
Conclusion:
The determinants of labour market income were estimated using an ordinary least squares multiple linear regression model. The results showed that the majority of the independent variables included were significant, and prior research provided reasoning as to why. Education had the largest impact on log labour market income: the coefficient of 1.317 on an undergraduate degree corresponds to an increase in income of approximately 131.7%, on average, compared with no schooling, under the usual 100·β approximation (the exact transformation, 100·(e^1.317 - 1), gives about 273%). Sector, province, occupation and race also had large impacts on log labour market income. It was found that white individuals earn significantly more than individuals of any other race group, illustrating that racial inequality is still present in South Africa.

To reduce income inequality in South Africa, a number of recommendations were provided. Predominantly, it was advised that interventions be implemented in the education system. If government could raise the compulsory education requirement from grade 9 to matric, it should increase student retention and result in more matric qualifications in the workplace. Expanding on this, private schooling should not be available only to the elite. If private schools were required to fund a certain number of underprivileged children through the education system, this would contribute to a reduction in racial inequality. Furthermore, emphasis should be placed on the need to learn English. Government could assist by implementing a stricter screening process for English teachers, as well as investing in incentive programmes and streamlining school administration.

The government could establish apprenticeship programmes to help individuals enter higher wage premium sectors. It was found that wages varied across sectors, with the private household sector earning the least. Further research confirmed that certain sectoral minimum wages were ineffective, with actual wages paid being lower. Government should implement and increase sectoral minimum wages in the private household sector and the agriculture, hunting, forestry and fishing sector, and should reinforce existing sectoral minimum wages.

In testing the regression model assumptions it was found that assumption 1, linearity in parameters, and assumption 4, zero conditional mean, were violated. The results of the study should thus be interpreted with caution, as there is likely to be bias in the regression coefficients. The homoskedasticity and normality of residuals assumptions were also found to be violated; this was accounted for through the application of robust standard errors and the use of a large sample.
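Because the dependent variable is the natural log of labour market income, reading a dummy coefficient β as a 100·β percent change is only accurate when β is small. The sketch below compares that approximation with the exact transformation 100·(e^β - 1) for a few coefficients taken from Table IR in the appendix. The function names are illustrative, and the selection of coefficients is ours.

```python
import math


def approx_pct(beta):
    """Percentage effect under the 100*beta approximation."""
    return 100.0 * beta


def exact_pct(beta):
    """Exact percentage effect of a dummy in a log-linear model."""
    return 100.0 * (math.exp(beta) - 1.0)


# Coefficients as reported in Table IR (initial regression output).
coefficients = {
    "Matric or Matric Equivalent": 0.522,
    "Diploma": 0.945,
    "Degree": 1.317,
    "Male": 0.263,
    "White": 0.509,
}

for name, beta in coefficients.items():
    print(f"{name:<28} approx {approx_pct(beta):6.1f}%   exact {exact_pct(beta):6.1f}%")
```

The two readings stay close for small coefficients such as the one on gender but diverge sharply for the education categories, so the large education effects are best quoted using the exact transformation.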
Appendix:
Table IR: Initial regression output

Variable | Model 1 coefficient | (Standard error)
Constant | 5.237*** | (0.247)
Household Headship | 0.0179 | (0.0213)
Age | 0.0325*** | (0.00732)
Age^2 | -0.000332*** | (9.30e-05)
Tenure | 0.0172*** | (0.00155)
Average Hours Worked Per Week | 0.00329*** | (0.000670)
Education Level (Base Category: No Schooling):
Completed or Some Primary | 0.0774 | (0.0859)
Some Secondary or Secondary Equivalent | 0.216** | (0.0900)
Matric or Matric Equivalent | 0.522*** | (0.0923)
Diploma | 0.945*** | (0.0974)
Degree | 1.317*** | (0.110)
Postgraduate Degree | 1.112*** | (0.109)
English Writing Level Proficiency (Base Category: Not at all):
Very well | 0.208*** | (0.0660)
Fair | 0.0749 | (0.0654)
Not well | -0.0227 | (0.0663)
Province (Base Category: Eastern Cape):
Western Cape | 0.201*** | (0.0509)
Northern Cape | 0.176** | (0.0749)
Free State | 0.174*** | (0.0550)
KwaZulu-Natal | 0.187*** | (0.0488)
North West | 0.350*** | (0.0568)
Gauteng | 0.365*** | (0.0431)
Mpumalanga | 0.457*** | (0.0527)
Limpopo | 0.179*** | (0.0542)
Gender (Base Category: Female):
Male | 0.263*** | (0.0227)
Trade Union Membership (Base Category: No):
Part of a Trade Union | 0.217*** | (0.0243)
Primary Occupation (Base Category: Elementary Occupations):
Managers | 0.640*** | (0.0511)
Professionals | 0.423*** | (0.0417)
Technicians and associate professionals | 0.426*** | (0.0500)
Clerical support workers | 0.213*** | (0.0476)
Service and sales workers | 0.165*** | (0.0322)
Skilled agricultural, forestry and fishery workers | 0.570*** | (0.207)
Craft and related trades workers | 0.133*** | (0.0406)
Plant and machine operators, and assemblers | 0.0692* | (0.0380)
Economic Sector of Primary Occupation (Base Category: Private Households):
Agriculture, hunting, forestry and fishing | 0.0648 | (0.0503)
Mining and Quarrying | 0.579*** | (0.0538)
Manufacturing | 0.226*** | (0.0469)
Electricity, gas and water supply | 0.332*** | (0.0699)
Construction | 0.105* | (0.0557)
Wholesale and Retail trade; repair etc; hotels and restaurants | 0.100** | (0.0506)
Transport, storage and communication | 0.305*** | (0.0527)
Financial intermediation, insurance, real estate and business services | 0.258*** | (0.0538)
Community, social and personal services | 0.223*** | (0.0397)
Catering and accommodation | 0.193*** | (0.0603)
Geographic Location Type (Base Category: Rural):
Urban | 0.0974*** | (0.0292)
Race Group (Base Category: African):
Coloured | 0.0763* | (0.0437)
Asian/Indian | 0.000442 | (0.583)
White | 0.509*** | (0.0385)
Marital Status (Base Category: Not Married):
Married | 0.147*** | (0.0235)
Perceived Health Status (Base Category: Poor to Fair):
Good to Excellent | 0.233*** | (0.0400)

Observations | 3,906
R-squared | 0.614

Standard errors in parentheses
*** p<0.01, ** p<0.05, * p<0.1
Data Source: NIDS Wave 3 (Southern Africa Labour and Development Research Unit, 2013; own calculations)
Reference List:
Aliber, M., Simbi, T. (2000): Agricultural Employment Crisis in South Africa. Trade and Industrial Policy Secretariat (TIPS). Department of Trade and Industry. 2
Benhura, M., Gwatidzo, T. (2013): Mining Sector Wages in South Africa. Labour Market Intelligence Partnership (LMIP). 9-10
Bosch, A. (2006): Determinants of public and private-sector wages in South Africa. Research Department. South African Reserve Bank. 9-24
Botha, F. (2010): The impact of educational attainment on household poverty in South Africa. Department of Economics and Economic History. Rhodes: University of Rhodes. 142-143
CEDEFOP. (2011): Labour-market polarisation and elementary occupations in Europe. Luxembourg: Publications Office of the European Union. 10
Daniels, R., Rospabé, S. (2005): Estimating an Earnings Function from Coarsened Data by an Interval Censored Regression Procedure. Development Policy Research Unit. Cape Town: University of Cape Town. 6-12
Department of Labour. (2015): Commission for Employment Equity Annual Report 2014 to 2015. Department of Labour. vii
De Villiers, L., Leibbrandt, M., Woolard, I. (2009): Methodology: Report on NIDS Wave 1. Cape Town: Southern Africa Labour and Development Research Unit.
De Villiers, L., Brown, M., Woolard, I., Daniels, R.C., Leibbrandt, M. (2013): National Income Dynamics Study Wave 3 User Manual. Cape Town: Southern Africa Labour and Development Research Unit.
Gernetzky, K. (2015): Mining industry, unions and government agree on plan to stem job losses. Business Day. Available at: http://www.bdlive.co.za/business/mining/2015/08/31/mining-industry-unions-and-government-agree-on-plan-to-stem-job-losses [2015, October 01]
International Labour Organisation. (2015): International Standard Classification of Occupations. Available at: http://www.ilo.org/public/english/bureau/stat/isco/isco88/9.htm [2015, October 02]
Leibbrandt, M., Woolard, I. (1999): Household Incomes, Poverty and Inequality in a Multivariate Framework. Development Policy Research Unit, University of Cape Town. 6-12
Miles, D. (1997): A household level study of the determinants of income and consumption. The Economic Journal. 107(1):23-24
Mincer, J. (1974): Schooling, Experience, and Earnings. New York National Bureau of Economic Research working paper no. 9362. New York: NBER.
Murphy, K. (1990): Empirical Age-Earnings Profiles. Chicago: The University of Chicago Press. 202-203
Pew Research Centre. (2015): The Rise of Dual Income Households [Online]. Available at: http://www.pewresearch.org/fact-tank/2014/06/12/5-facts-about-todays-fathers/ft_dual-income-households-1960-2012-2/ [2015, October 02]
South African Revenue Services. (2014): Tax Statistics – Highlights. South African Revenue Services and National Treasury. 9
Southern Africa Labour and Development Research Unit. (2013): National Income Dynamics Study 2012, Wave 3 [dataset]. Version 1.2. Cape Town: Southern Africa Labour and Development Research Unit [producer], 2013. Cape Town: DataFirst [distributor], 2013.
Statistics South Africa. (2012): Income and Expenditure of Households. Report; no P0100. Johannesburg: Statistics South Africa. 11
Statistics South Africa. (2013): Gross Domestic Product: Regional and Annual estimates. Report; no P0441. Johannesburg: Statistics South Africa. 46
Statistics South Africa. (2015): Mid-year population estimate. Report; no P0302. Pretoria: Statistics South Africa. 2
Statistics South Africa. (2015): Quarterly Labour Force Survey - Quarter 1. Report; no P0211. Pretoria: Statistics South Africa. 62
Van den Berg, S., Louw, M. (2003): Changing patterns of South African income distribution: Towards time series estimates of distribution and poverty. Paper to the Conference of the Economic Society of South Africa, 17-19 September 2003. Stellenbosch: University of Stellenbosch. 19-23
Williams, R. (2015): Heteroskedasticity. Available at: https://www3.nd.edu/~rwilliam/stats2/l25 [2015, October 01]
World Bank. (2015): GINI Index World Bank Estimate. The World Bank Development Research Group. Available at: http://data.worldbank.org/indicator/SI.POV.GINI [2015, October 02]