Table of Contents
1. Key Variables
2. Data Overview
3. Summary Statistics
4. OLS Regression of Poverty Rates on Segregation
5. Assessing Instrument Strength
6. Reduced Form
7. IV Estimation: Second Stage
8. Conclusion
9. References
This project replicates the analysis from Elizabeth Ananat’s paper, The Wrong Side(s) of the Tracks: The Causal Effects of Racial Segregation on Urban Poverty and Inequality (American Economic Journal: Applied Economics, 2011). The study investigates how racial segregation influences urban poverty and inequality, using railroad track layouts as an instrumental variable (IV) to estimate the causal impact of segregation on poverty rates.
- dism1990: 1990 dissimilarity index.
- herf: Railroad Division Index (RDI).
- lenper: Track length per square kilometer.
- povrate_w: White poverty rate in 1990.
- povrate_b: Black poverty rate in 1990.
- area1910: Physical area of the city in 1910 (1000 sq. miles).
- count1910: Population in 1910 (1000s).
- black1910: Percent Black in 1910.
- passpc: Streetcars per capita in 1915.
- incseg: Income segregation in 1990.
- pctbk1990: Percent Black in 1990.
The data come from the American Economic Association's (AEA) website, which links to the ICPSR data repository. Anyone can sign in to access the replication data files.
Each observation represents a city. The dataset spans multiple years, capturing variables related to racial composition, economic outcomes, and urban infrastructure. We focus on the relationships between segregation (measured by the dissimilarity index) and poverty rates among Black and White populations.
We begin by examining summary statistics for key variables: dism1990, herf, lenper, and poverty rates (povrate_w, povrate_b). These statistics provide an overview of the distribution and variability of these variables across cities.
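# Mean, SD, min and max of each key variable (one row per variable)
key_vars <- c("dism1990", "herf", "lenper", "povrate_w", "povrate_b")
summary_stats <- data.frame(
  Mean = sapply(df[key_vars], mean, na.rm = TRUE),
  SD   = sapply(df[key_vars], sd, na.rm = TRUE),
  Min  = sapply(df[key_vars], min, na.rm = TRUE),
  Max  = sapply(df[key_vars], max, na.rm = TRUE))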
A simple OLS regression is run to explore the relationship between racial segregation and poverty rates for both White and Black populations.
Regression Model:
A one standard deviation increase in the segregation index is associated with a one percentage point decrease in the White poverty rate and a 2.5 percentage point increase in the Black poverty rate.
This model estimates how changes in segregation are associated with changes in poverty rates. However, this approach does not account for potential confounders that could bias the results.
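As a minimal sketch of the OLS step described above, the two regressions could be run with base R's lm(). The data frame here is synthetic (made-up numbers, not the replication data), and the names ols_w, ols_b and sd_seg are hypothetical. The original analysis may have included controls that this sketch omits.

```r
# Synthetic stand-in for the replication data (all values are made up).
set.seed(42)
n <- 100
df <- data.frame(dism1990 = runif(n, 0.3, 0.9))
df$povrate_w <- 0.12 - 0.05 * df$dism1990 + rnorm(n, sd = 0.02)
df$povrate_b <- 0.15 + 0.20 * df$dism1990 + rnorm(n, sd = 0.05)

# One OLS regression per outcome: poverty rate on the dissimilarity index.
ols_w <- lm(povrate_w ~ dism1990, data = df)
ols_b <- lm(povrate_b ~ dism1990, data = df)

# Scale each slope by the SD of dism1990 to express it as the change
# associated with a one standard deviation increase in segregation.
sd_seg   <- sd(df$dism1990)
effect_w <- coef(ols_w)[["dism1990"]] * sd_seg
effect_b <- coef(ols_b)[["dism1990"]] * sd_seg
print(c(white = effect_w, black = effect_b))
```

Multiplying the slope by the standard deviation of the regressor is what turns a raw coefficient into the "one standard deviation increase" statement used in the text.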
Given that the OLS model may suffer from endogeneity (e.g., omitted variable bias), we apply an instrumental variables (IV) approach to estimate the causal effect of segregation on poverty. The instrument is the railroad division index (herf), with track length per square kilometer (lenper) as a control. The RDI is correlated with segregation but is assumed not to affect poverty rates directly, because railroads were laid out, randomly or according to transport efficiency, before the cities developed.
First Stage Regression:
In the first stage, we regress segregation on the instrument(s) to check whether the instruments are significantly correlated with segregation.
A one standard deviation increase in the RDI is associated with an increase of about 0.05 in the segregation index (standard deviation of 0.14 × coefficient of 0.357 ≈ 0.050), that is, roughly 5 points on a 0-100 scale.
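The first stage and the standard-deviation scaling can be sketched as follows. The data are synthetic (made-up values chosen only so the code runs), and the names first_stage, b_herf, effect_sim and effect_text are hypothetical. Only the last line uses figures quoted in the text.

```r
# Synthetic stand-in for the replication data (all values are made up).
set.seed(1)
n <- 120
df <- data.frame(herf = runif(n, 0.2, 0.9), lenper = rexp(n, rate = 500))
df$dism1990 <- 0.35 + 0.36 * df$herf + 15 * df$lenper + rnorm(n, sd = 0.1)

# First stage: segregation on the instrument (herf) and the control (lenper).
first_stage <- lm(dism1990 ~ herf + lenper, data = df)
print(summary(first_stage)$coefficients)

# Effect of a one standard deviation increase in the RDI, on the synthetic data.
b_herf     <- coef(first_stage)[["herf"]]
effect_sim <- b_herf * sd(df$herf)

# The same arithmetic with the numbers quoted in the text: SD x coefficient.
effect_text <- 0.14 * 0.357
print(effect_text)
```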
To assess the strength of the instrument, we examine the F-statistic from the first-stage regression. By a common rule of thumb, a value greater than 10 indicates a sufficiently strong instrument; with a weak instrument, IV estimates are unreliable. Our model has an F-statistic of 14.98.
summary(model)$fstat[1]
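The overall model F-statistic tests all regressors jointly, including the control lenper, whereas the rule of thumb is usually applied to the F-statistic on the excluded instrument alone. It is not clear from the text which of the two is reported, so the sketch below shows one way to compute the latter with base R, under homoskedastic errors. The data are synthetic and the names restricted, full, f_excl and t_herf are hypothetical.

```r
# Synthetic stand-in for the replication data (all values are made up).
set.seed(2)
n <- 120
df <- data.frame(herf = runif(n, 0.2, 0.9), lenper = rexp(n, rate = 500))
df$dism1990 <- 0.35 + 0.36 * df$herf + 15 * df$lenper + rnorm(n, sd = 0.1)

# First stage without and with the instrument.
restricted <- lm(dism1990 ~ lenper, data = df)
full       <- lm(dism1990 ~ herf + lenper, data = df)

# F-test for excluding herf: compares the two nested models.
f_test <- anova(restricted, full)
f_excl <- f_test$F[2]

# With a single instrument this F equals the squared t-statistic on herf.
t_herf <- summary(full)$coefficients["herf", "t value"]
print(c(F_excluded = f_excl, t_squared = t_herf^2))
```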
Next, we estimate the reduced-form equation, regressing poverty rates directly on the instrument (RDI) while controlling for track length, without first modeling segregation. This step shows the relationship between the instrument and the outcomes.
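A reduced-form regression of this kind might look like the sketch below, assuming the same specification as the first stage with the outcome swapped in. The data are synthetic and the names reduced_b and reduced_w are hypothetical.

```r
# Synthetic stand-in for the replication data (all values are made up).
set.seed(3)
n <- 120
df <- data.frame(herf = runif(n, 0.2, 0.9), lenper = rexp(n, rate = 500))
df$dism1990  <- 0.35 + 0.36 * df$herf + 15 * df$lenper + rnorm(n, sd = 0.1)
df$povrate_b <- 0.15 + 0.25 * df$dism1990 + rnorm(n, sd = 0.05)
df$povrate_w <- 0.12 - 0.08 * df$dism1990 + rnorm(n, sd = 0.03)

# Reduced form: each outcome on the instrument and the control.
# dism1990 does not appear on the right-hand side.
reduced_b <- lm(povrate_b ~ herf + lenper, data = df)
reduced_w <- lm(povrate_w ~ herf + lenper, data = df)

# Coefficient on herf for each outcome.
print(c(black = coef(reduced_b)[["herf"]], white = coef(reduced_w)[["herf"]]))
```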
In the second stage, we regress poverty rates on the predicted values of segregation from the first stage. This provides an IV estimate of the causal effect of segregation on poverty rates.
Second Stage Model:
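# 2SLS for Black poverty: dism1990 is instrumented by herf, with lenper as a control
model_BI <- felm(povrate_b ~ lenper | 0 | (dism1990 ~ herf + lenper), data = df)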
The coefficient on the predicted segregation variable reflects the causal impact of segregation on poverty, addressing potential biases in the OLS estimate.
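To make the two-stage logic concrete, the sketch below carries out both stages by hand with lm() on synthetic data. With one instrument and one endogenous regressor, it also checks that the 2SLS point estimate equals the reduced-form coefficient divided by the first-stage coefficient. The names stage1, stage2, iv_2sls and iv_ratio are hypothetical. Standard errors from a hand-run second stage are not valid, which is one reason to use a dedicated routine such as felm for inference.

```r
# Synthetic stand-in for the replication data (all values are made up).
set.seed(4)
n <- 200
u <- rnorm(n)  # unobserved confounder that biases OLS
df <- data.frame(herf = runif(n, 0.2, 0.9), lenper = rexp(n, rate = 500))
df$dism1990  <- 0.35 + 0.36 * df$herf + 15 * df$lenper + 0.05 * u + rnorm(n, sd = 0.08)
df$povrate_b <- 0.15 + 0.25 * df$dism1990 + 0.03 * u + rnorm(n, sd = 0.04)

# Stage 1: predict segregation from the instrument and the control.
stage1 <- lm(dism1990 ~ herf + lenper, data = df)
df$dism_hat <- fitted(stage1)

# Stage 2: regress the outcome on predicted segregation and the control.
stage2  <- lm(povrate_b ~ dism_hat + lenper, data = df)
iv_2sls <- coef(stage2)[["dism_hat"]]

# Equivalent ratio: reduced-form coefficient over first-stage coefficient.
reduced  <- lm(povrate_b ~ herf + lenper, data = df)
iv_ratio <- coef(reduced)[["herf"]] / coef(stage1)[["herf"]]

print(c(two_stage = iv_2sls, ratio = iv_ratio))
```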
The results indicate that segregation has a statistically significant causal effect on poverty rates for both groups: increasing segregation raises poverty among the Black population and lowers poverty among the White population.
We test the robustness of our results by including additional control variables, such as city size, historical population composition, and other demographic characteristics. These controls help account for other factors that could influence both segregation and poverty.
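One way such a robustness check could be organized is to re-estimate the IV coefficient under several control sets and compare the results. The sketch below does this on synthetic data. The helper iv_coef, the list specs and the choice of controls in each specification are hypothetical illustrations, not the specifications used in the original analysis.

```r
# Synthetic stand-in for the replication data (all values are made up).
set.seed(7)
n <- 150
df <- data.frame(
  herf      = runif(n, 0.2, 0.9),
  lenper    = rexp(n, rate = 500),
  area1910  = rexp(n, rate = 50),
  count1910 = rexp(n, rate = 0.02),
  black1910 = runif(n, 0, 0.3)
)
df$dism1990  <- 0.35 + 0.36 * df$herf + 15 * df$lenper + rnorm(n, sd = 0.1)
df$povrate_b <- 0.15 + 0.25 * df$dism1990 + 0.1 * df$black1910 + rnorm(n, sd = 0.05)

# Just-identified IV point estimate: reduced form over first stage,
# with the same controls in both regressions.
iv_coef <- function(outcome, controls, data) {
  rhs     <- paste(c("herf", controls), collapse = " + ")
  first   <- lm(as.formula(paste("dism1990 ~", rhs)), data = data)
  reduced <- lm(as.formula(paste(outcome, "~", rhs)), data = data)
  coef(reduced)[["herf"]] / coef(first)[["herf"]]
}

# Hypothetical control sets: baseline, then adding 1910 city characteristics.
specs <- list(
  baseline  = "lenper",
  plus_1910 = c("lenper", "area1910", "count1910", "black1910")
)
estimates <- sapply(specs, function(ctrl) iv_coef("povrate_b", ctrl, df))
print(estimates)
```

If the estimate moves little across specifications, that is consistent with the instrument being unrelated to the added controls.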
The IV approach provides a more reliable estimate of the causal impact of segregation on poverty rates than OLS because it addresses endogeneity concerns. By using an instrument based on railroad infrastructure, we obtain a clearer understanding of how segregation influences poverty, particularly for Black populations: higher segregation causes a statistically significant increase in Black poverty rates.
[Comparison of the simple regression with the IV regression]
Ananat, Elizabeth Oltmans. “The Wrong Side(s) of the Tracks: The Causal Effects of Racial Segregation on Urban Poverty and Inequality.” American Economic Journal: Applied Economics, vol. 3, no. 2, 1 Apr. 2011, pp. 34–66, www.nber.org/system/files/working_papers/w13343/w13343.pdf, https://doi.org/10.1257/app.3.2.34.