3/19/13

The quality of the Census Bureau's statistics is highly disappointing


Looking closer at the household income distribution reported by the Census Bureau, I found a confusing conflict in the data on the Gini ratio. Two tables provide annual Gini ratio estimates for all households: “HINC-01. Selected Characteristics of Households, by Total Money” and “Table H-4. Gini Ratios for Households, by Race and Hispanic Origin of Householder: 1967 to 2011”. The former estimates have been available since 1998 in “Detailed Tables”. Figure 1 reveals a dramatic difference between these two estimates for the same years. Unlike the H-4 series, the HINC-01 time series does not show any increase in the Gini ratio between 1998 and 2008.

There is a dramatic 0.02 step in 2009 in the HINC-01 series, which is fully explained by the change in income bins in 2009: the original $2500 bins between $0 and $100,000 were replaced by $5000 bins between $0 and $200,000. This change in the range and granularity of the data produced the observed shift in the Gini ratio. Figure 2 displays the Lorenz curves for the years between 1994 and 2011 as obtained from the household income distributions in the relevant HINC-01 tables. The Gini ratio step of 0.02 is explained by the difference between the 2008 and 2009 Lorenz curves. The step is therefore artificial, and the HINC-01 curve would have stayed at the level of 0.45 between 2009 and 2011 if the $2500 bins and the $100,000 range had been retained.

We have recalculated the Gini ratio for all years between 1994 and 2011 from the original household income distribution and found that the calculated curve is in excellent agreement with the HINC-01 series between 2005 and 2011 and coincides with the H-4 curve from 1994 to 1996.
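To make the bin effect concrete, here is a minimal Python sketch of one common way to estimate a Gini ratio from binned data: build the Lorenz curve from cumulative population and income shares and apply the trapezoid rule. This is not necessarily the procedure used by the Census Bureau or in the recalculation above. The function name `gini_from_bins` and all the numbers are hypothetical, invented for illustration only, and the sketch assumes that each bin comes with a household count and a mean income.

```python
from itertools import accumulate


def gini_from_bins(counts, means):
    """Gini ratio from grouped data via the trapezoid rule on the Lorenz curve.

    counts[i] is the number of households in bin i and means[i] is their
    mean income; bins must be ordered from lowest to highest income.
    """
    incomes = [c * m for c, m in zip(counts, means)]
    total_pop = sum(counts)
    total_inc = sum(incomes)
    # Cumulative population and income shares are the Lorenz curve points.
    pop = [0.0] + [x / total_pop for x in accumulate(counts)]
    inc = [0.0] + [x / total_inc for x in accumulate(incomes)]
    # Twice the area under the piecewise-linear Lorenz curve.
    area2 = sum(
        (pop[i] - pop[i - 1]) * (inc[i] + inc[i - 1])
        for i in range(1, len(pop))
    )
    return 1.0 - area2


# Toy data (NOT Census figures): the top bin is open-ended at $100,000.
coarse_counts = [50, 30, 20]
coarse_means = [20_000, 60_000, 150_000]

# Same households, but the top bin is split at $200,000.
# The split preserves the total: 15 * 120,000 + 5 * 240,000 = 20 * 150,000.
fine_counts = [50, 30, 15, 5]
fine_means = [20_000, 60_000, 120_000, 240_000]

gini_coarse_top = gini_from_bins(coarse_counts, coarse_means)
gini_fine_top = gini_from_bins(fine_counts, fine_means)

print(f"top bin open at $100,000: {gini_coarse_top:.3f}")
print(f"top bin split at $200,000: {gini_fine_top:.3f}")
```

With these toy numbers the estimate rises from about 0.42 to about 0.44 although no household income has changed. Resolving the upper tail in more detail raises a Gini estimate computed this way, which is consistent with the 2009 step being an artifact of the new bins.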

These observations are extremely confusing and likely reflect internal changes in the Census Bureau's procedures for estimating and publishing the Gini ratio. We cannot understand why formally identical time series differ so much, or why the H-4 curve shows a growth tendency while the underlying data give no reason for it. This difference also has many political implications, since the research community, officials, media, and the blogosphere all discuss a growth in the household Gini ratio that the underlying data do not show.

Figure 1. Gini ratios from the HINC-01 (detailed) and H-4 (historical) tables.


Figure 2. The evolution of the Lorenz curves for the household income distribution between 1994 and 2011 in the USA. In 2009, the binning was changed from $2500 bins in the range up to $100,000 to $5000 bins, with the upper bin between $195,000 and $200,000.
