Quantitative Techniques for Health Equity Analysis—Technical Note #7
The Concentration Index

Introduction

The concentration index [1-3] and related concentration curve (see Technical Note #6) provide a means of quantifying the degree of income-related inequality in a specific health variable. For example, it could be used to quantify the degree to which health subsidies are better targeted towards the poor in some countries than others [4], or the degree to which child mortality is more unequally distributed to the disadvantage of poor children in one country than another [5], or the extent to which inequalities in adult health are more pronounced in some countries than in others [6]. Many other applications are possible. This Note describes how to compute the concentration index, and how to obtain a standard error for it. Both the grouped-data and micro-data cases are considered.

The concentration index defined

The concentration index is defined with reference to the concentration curve (q.v.), which graphs on the x-axis the cumulative percentage of the sample, ranked by living standards, beginning with the poorest, and on the y-axis the cumulative percentage of the health variable corresponding to each cumulative percentage of the distribution of the living standard variable. Figure 1 provides an example of a concentration curve, where the health variable is ill-health, which in this example is higher amongst the poor than amongst the better-off. The concentration index is defined as twice the area between the concentration curve, L(p), and the line of equality (the 45° line running from the bottom-left corner to the top-right). So, in the case where there is no income-related inequality, the concentration index is zero. The convention is that the index takes a negative value when the curve lies above the line of equality, indicating disproportionate concentration of the health variable among the poor, and a positive value when it lies below the line of equality. If the health variable is a ‘bad’ such as ill health, a negative value of the concentration index means ill health is higher among the poor.
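
The area definition can be written compactly. As a sketch, assuming p and L(p) are measured as proportions on [0, 1] rather than as percentages, the area under the line of equality is 1/2, so that

$$C = 2\left(\frac{1}{2} - \int_0^1 L(p)\,dp\right) = 1 - 2\int_0^1 L(p)\,dp.$$

When L(p) lies above the line of equality the integral exceeds 1/2 and C is negative; when the curve coincides with the line of equality, C = 0.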

Figure 1: Ill-health concentration curve

[Figure not reproduced. x-axis: cumulative % of persons, ranked by economic status (0% to 100%); y-axis: cumulative % of ill health (0% to 100%); plotted curve: L(p).]

The grouped-data case

Computing the concentration index from grouped data

The concentration index, C, is easily computed in a spreadsheet program using the following formula [7]:

$$C = (p_1 L_2 - p_2 L_1) + (p_2 L_3 - p_3 L_2) + \dots + (p_{T-1} L_T - p_T L_{T-1}),$$

where $p_t$ is the cumulative percent of the sample ranked by economic status, $L_t = L(p_t)$ is the corresponding concentration curve ordinate, and T is the number of socioeconomic groups.

Table 1 provides a worked example. It shows the number of births in each wealth group over the period 1982-92 in India. Expressing these as percentages of the total number of births, and cumulating them, gives the cumulative percentage of births, ordered by wealth. This is what is plotted on the x-axis in the concentration curve diagram and gives us p. (See the Technical Note on the concentration curve for the concentration curve graph for these data.) Also shown are the under-five mortality rates (U5MR) for each of five wealth groups. Multiplying the U5MR by the number of births gives the number of deaths in each wealth group. Expressing these as a percentage of the total number of deaths, and cumulating them, gives the cumulative percentage of deaths for the corresponding percentage of births. This is what is plotted on the y-axis in Figure 1, and gives us L(p). The final column shows the terms in brackets in the formula above, there being T-1 terms in total. The sum of these is -0.1694, which is the concentration index. The negative concentration index reflects the higher mortality rates amongst poorer children.

Table 1: Under-five deaths in India, 1982-92

| Wealth group | No. of births | Rel % births | Cumul % births | U5MR per 1000 | No. of deaths | Rel % deaths | Cumul % deaths | Conc. index |
|---|---|---|---|---|---|---|---|---|
| Poorest | 29939 | 23% | 23% | 154.7 | 4632 | 30% | 30% | -0. |
| 2nd | 28776 | 22% | 45% | 152.9 | 4400 | 29% | 59% | -0. |
| Middle | 26528 | 20% | 66% | 119.5 | 3170 | 21% | 79% | -0. |
| 4th | 24689 | 19% | 85% | 86.9 | 2145 | 14% | 93% | -0. |
| Richest | 19739 | 15% | 100% | 54.3 | 1072 | 7% | 100% | 0. |
| Total/average | 129671 | | | 118.8 | 15419 | | | -0.1694 |
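
The spreadsheet calculation can also be sketched in a few lines of code. The Python below is an illustration only (the note itself uses a spreadsheet); it assumes the birth and death counts printed in Table 1, works with cumulative proportions rather than percentages, and uses hypothetical helper names (`cumulative_shares`, `grouped_concentration_index`).

```python
# Table 1 counts: births and under-five deaths by wealth group, poorest first.
births = [29939, 28776, 26528, 24689, 19739]
deaths = [4632, 4400, 3170, 2145, 1072]


def cumulative_shares(counts):
    """Return the cumulative proportions (between 0 and 1) of a list of counts."""
    total = sum(counts)
    running, shares = 0, []
    for c in counts:
        running += c
        shares.append(running / total)
    return shares


def grouped_concentration_index(x_counts, health_counts):
    """Sum the T-1 bracketed terms p_t * L_{t+1} - p_{t+1} * L_t."""
    p = cumulative_shares(x_counts)       # x-axis: cumulative share of births
    L = cumulative_shares(health_counts)  # y-axis: cumulative share of deaths
    terms = [p[t] * L[t + 1] - p[t + 1] * L[t] for t in range(len(p) - 1)]
    return sum(terms), terms


C, terms = grouped_concentration_index(births, deaths)
print(round(C, 3))  # about -0.169
```

Because the death counts in the table are rounded, the result should agree with the reported index only up to rounding.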

Computing a standard error for the concentration index with grouped data

A standard error can be computed for C in the grouped data case using a formula given in Kakwani et al. [2]. Let n denote the sample size, T the number of groups, $f_t$ the proportion of the sample in the t-th group, $\mu_t$ the mean value of the health variable amongst the t-th group (with $\mu$ the overall mean), and C the concentration index. Let $R_t$ be the fractional rank of the t-th group, defined as

$$R_t = \sum_{\gamma=1}^{t-1} f_\gamma + \frac{1}{2} f_t,$$

and hence indicating the cumulative proportion of the population up to the midpoint of each group interval. The variance of C is given by eqn (14) in Kakwani et al.:

$$\operatorname{var}(C) = \frac{1}{n}\left[\sum_{t=1}^{T} f_t a_t^2 - (1 + C)^2\right] + \frac{1}{n\mu^2}\sum_{t=1}^{T} f_t \sigma_t^2 (2R_t - 1 - C)^2,$$

where $\sigma_t^2$ is the variance of $\mu_t$,

$$a_t = \frac{\mu_t}{\mu}(2R_t - 1 - C) + 2 - q_{t-1} - q_t,$$

$$q_t = \frac{1}{\mu}\sum_{\gamma=1}^{t} \mu_\gamma f_\gamma,$$

which is the ordinate of L(p), $q_0 = 0$, and $p_t = \sum_{\gamma=1}^{t} f_\gamma R_\gamma$.

Case where variances of the group means are unknown

In many applications, the standard errors of the group means will be unknown. For example, the data might have been obtained from published tabulations by income quintile. In such a case, the second term in the expression for the variance of C will necessarily be assumed to be equal to zero. However, in addition, one needs to replace n by T in the denominator of the first term, since there are in effect only T observations, not n.
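
A minimal Python sketch of this special case follows, for illustration only. It assumes the Table 1 counts, takes each group's mean as deaths divided by births, drops the second variance term, and divides by T rather than n as described above. The function name `grouped_ci_and_se` is hypothetical, and `p` in the code is simply the cumulative sample proportion used for the curve's x-axis.

```python
import math

# Table 1 counts, poorest group first (group variances are not available).
births = [29939, 28776, 26528, 24689, 19739]
deaths = [4632, 4400, 3170, 2145, 1072]


def grouped_ci_and_se(group_sizes, health_totals):
    """Concentration index and its standard error when group variances are unknown."""
    T = len(group_sizes)
    n = sum(group_sizes)
    f = [g / n for g in group_sizes]                            # f_t
    mu_t = [h / g for h, g in zip(health_totals, group_sizes)]  # group means
    mu = sum(ft * mt for ft, mt in zip(f, mu_t))                # overall mean

    p, q, R = [0.0], [0.0], []      # p_0 = q_0 = 0
    for ft, mt in zip(f, mu_t):
        R.append(p[-1] + 0.5 * ft)  # fractional rank at the group midpoint
        p.append(p[-1] + ft)        # cumulative sample proportion
        q.append(q[-1] + mt * ft / mu)  # concentration curve ordinate q_t

    # Grouped-data formula for C (the t = 0 term is zero by construction).
    C = sum(p[t] * q[t + 1] - p[t + 1] * q[t] for t in range(T))
    a = [mu_t[t] / mu * (2 * R[t] - 1 - C) + 2 - q[t] - q[t + 1]
         for t in range(T)]
    # First variance term only, with T in place of n.
    var_C = (sum(ft * at ** 2 for ft, at in zip(f, a)) - (1 + C) ** 2) / T
    return C, math.sqrt(max(var_C, 0.0)), f, a


C, se, f, a = grouped_ci_and_se(births, deaths)
print(round(C, 3), round(se, 3))
```

With only five groups the resulting standard error is large relative to one based on the full sample size, which is the price of treating each group as a single observation.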

The micro-data case

[…] are used to measure inequality in malnutrition between poor and better-off children. Malnutrition is measured by the child’s height-for-age percentile score (HAP) in a hypothetical population of well-nourished children assembled by the US National Center for Health Statistics (NCHS). Thus a score of 50 means the child in question is at the median height-for-age in the well-nourished reference population. We rank children by per capita household consumption (PCCONS). Initially, the commands below use sample weights (WT), as the 1998 VLSS is not nationally representative without them. These weights, or expansion factors, indicate the number of people in Vietnam that each sample observation represents.

Computing the concentration index from micro-data

The concentration index ( C ) can be computed very simply by making use of the “convenient covariance” result [8-10]:

$$C = \frac{2\,\operatorname{cov}(y_i, R_i)}{\mu},$$

where y is the health variable whose inequality is being measured, $\mu$ is its mean, $R_i$ is the i-th individual’s fractional rank in the socioeconomic distribution (e.g. the person’s rank in the income distribution), and cov(.,.) is the covariance. Where the data are weighted, a weighted covariance needs to be computed, and a weighted fractional rank needs to be generated [10].
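
The covariance result can be sketched in plain Python, for illustration only. The sketch assumes a weighted fractional rank defined as the cumulative weight share up to the midpoint of each observation, a population-style (divide by total weight) covariance, and no special handling of ties in the living standards variable. The function names are hypothetical.

```python
def fractional_rank(living_standard, weights):
    """Weighted fractional rank: cumulative weight share up to each observation's midpoint."""
    order = sorted(range(len(living_standard)), key=lambda i: living_standard[i])
    total = float(sum(weights))
    rank = [0.0] * len(living_standard)
    cum = 0.0
    for i in order:
        rank[i] = (cum + 0.5 * weights[i]) / total
        cum += weights[i]
    return rank


def concentration_index(y, living_standard, weights=None):
    """C = 2 * cov(y, R) / mean(y), using weighted moments when weights are given."""
    if weights is None:
        weights = [1.0] * len(y)
    total = float(sum(weights))
    R = fractional_rank(living_standard, weights)
    mean_y = sum(w * v for w, v in zip(weights, y)) / total
    mean_R = sum(w * r for w, r in zip(weights, R)) / total
    cov = sum(w * (v - mean_y) * (r - mean_R)
              for w, v, r in zip(weights, y, R)) / total
    return 2.0 * cov / mean_y


# Health rises with living standards, so the index is positive.
print(concentration_index([1, 2, 3, 4], [10, 20, 30, 40]))  # 0.25
```

For this toy example the grouped-data formula with four equal-sized groups gives the same value, which is a useful cross-check on the rank definition.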

Stata commands for computing the concentration index

The command GLCURVE (a program downloadable from the Stata website) can be used to generate the fractional rank in the distribution of income or whatever measure of living standards is being used. This can be used for weighted data. The COR command (weighted if necessary), along with the means and covariance options, can then be used to obtain the mean of the health variable and the covariance between it and the fractional rank variable. In the malnutrition example, the GLCURVE command generates the fractional rank variable CONRNK from the PCCONS variable. The COR command then calculates the mean of the HAP variable and the covariance between the fractional rank variable CONRNK and HAP.

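    * weighted fractional rank conrnk, based on pccons
    glcurve pccons [fw=wt], pvar(conrnk)
    * weighted mean of hap and its covariance with conrnk
    cor conrnk hap [fw=wt], c m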

The covariance between HAP and CONRNK is 1.1505 and the mean of the HAP is 14.024 (meaning the average Vietnamese child is only at the fourteenth percentile in the reference population). This gives a concentration index of 0.1641—i.e. a tendency for better-off children in Vietnam to be taller (and better nourished) than poor children.

SPSS commands for computing the concentration index

The fractional rank variable can be computed by the RANK command. The CORRELATION command with the covariance option can be used to obtain the covariance between the health variable and the fractional rank variable. The DESCRIPTIVES command can then be used to calculate the mean of the health variable. All these commands need to be preceded by the WEIGHT option if the sample is weighted. The SPSS syntax below is for the malnutrition example.

    WEIGHT BY wt.
    RANK VARIABLES=pccons (A)
      /RFRACTION into RNKCON
      /PRINT=YES
      /TIES=MEAN.
    CORRELATIONS
      /VARIABLES=rnkcon hap
      /STATISTICS XPROD
      /MISSING=PAIRWISE.
    DESCRIPTIVES VARIABLES=hap rnkcon
      /STATISTICS=MEAN.

Computing a standard error for the concentration index—the micro-data case

There are two ways to compute the standard error of C with micro-data. The second “convenient regression” method is easier to implement, and seems likely to be at least as precise. It also has the advantage of yielding an estimate of the concentration index itself. Neither, however, is appropriate with weighted data. In the example, we have assumed for illustrative purposes that the VLSS data are self-weighting. The value of C obtained ignoring the weighted character of the data is 0.1731.

The formula method

The first is to use the formula given in eqn (22) in Kakwani et al. [2]:

$$\operatorname{var}(C) = \frac{1}{n}\left[\frac{1}{n}\sum_{i=1}^{n} a_i^2 - (1 + C)^2\right],$$

where

$$a_i = \frac{y_i}{\mu}(2R_i - 1 - C) + 2 - q_{i-1} - q_i$$

and

$$q_i = \frac{1}{\mu n}\sum_{\gamma=1}^{i} y_\gamma$$

is the ordinate of the concentration curve L(p), and $q_0 = 0$.

This is easily computed in Stata with the following commands, which are for the malnutrition example.

    glcurve hap, glvar(glhap) sortvar(lnpcexp) pvar(incrnk)
    egen meany = mean(hap)
    gen ccurve = glhap / meany
    sort ccurve
    gen cclag = ccurve[_n - 1]
    * .173122 is the unweighted estimate of C
    gen a = (hap/meany) * (2*incrnk - 1 - .173122) + 2 - cclag - ccurve
    gen asq = a^2
    sum asq

The GLCURVE command generates GLHAP, which, divided through by the mean of the health variable HAP, gives the concentration curve ordinate CCURVE (the analogue of q or L(p)). The next two commands generate the lagged value of L(p), or $q_{i-1}$. Inserting the estimated value of C in the next command generates the variable a. The mean of $a^2$ is then obtained, which can then be used to compute var(C) manually using the formula above. In the VLSS example, the mean of $a^2$ is equal to 2.1741, which gives a value of se(C) equal to 0.0124.
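
The same steps can be sketched in Python, for illustration only. The sketch assumes unweighted data already sorted from poorest to richest and a fractional rank at the midpoint of each observation, (i - 0.5)/n; the rank variable produced by GLCURVE may be defined slightly differently. The function name `ci_standard_error` is hypothetical.

```python
import math


def ci_standard_error(y_sorted):
    """Return (C, se) for health values ordered from poorest to richest."""
    n = len(y_sorted)
    mu = sum(y_sorted) / n
    R = [(i + 0.5) / n for i in range(n)]   # fractional rank at midpoints
    q = [0.0]                                # q_0 = 0
    for v in y_sorted:
        q.append(q[-1] + v / (mu * n))       # concentration curve ordinate q_i
    # C from the convenient covariance result (the mean of R is 0.5 here).
    C = 2.0 * sum((v - mu) * (r - 0.5) for v, r in zip(y_sorted, R)) / (n * mu)
    a = [y_sorted[i] / mu * (2 * R[i] - 1 - C) + 2 - q[i] - q[i + 1]
         for i in range(n)]
    mean_a_sq = sum(x * x for x in a) / n
    var_C = (mean_a_sq - (1 + C) ** 2) / n
    return C, math.sqrt(max(var_C, 0.0))


C, se = ci_standard_error([1, 2, 3, 4])
print(round(C, 4), round(se, 4))  # 0.25 0.075
```

In this toy example the a values are 1.5, 1.2, 1.1 and 1.2, so the mean of their squares is 1.585 and var(C) = (1.585 - 1.25^2)/4 = 0.005625.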

The convenient regression method

The “convenient covariance” result above can be used to define a convenient regression for the concentration index [2], equal to

$$2\sigma_R^2\left[\frac{y_i}{\mu}\right] = \alpha + \beta R_i + u_i,$$

where $\sigma_R^2$ is the variance of the fractional rank variable. The estimate of β is equal to the concentration index, C. Estimating this equation is an alternative to (but equivalent to) the convenient covariance method. It also gives rise to an alternative interpretation of the concentration index as the slope of a line passing through the heads of a parade of people, ranked by their consumption or SES, with each person's height proportional to the value of their health variable, expressed as a fraction of the mean. The standard error of β provides an estimate of the standard error of C, but is inaccurate since the nature of the fractional rank variable induces a particular pattern of autocorrelation in the data. The formula above gets round this, but an alternative is to use the Newey-West [11] regression estimator, which corrects for autocorrelation, as well as any heteroscedasticity. The commands below implement this for the malnutrition example.