Happy 2023 from friends of #ed3learning everywhere & ED && Ed3EN...... 6 years after serving as teen allied bomber command burma, dad norman macrae met von neumann

Friday, March 31, 2017

digital research on future of youth and education

In China there is a lot of buzz about new learning about societies that can be mined from big data sources such as WeChat. Please send us examples: chris.macrae@yahoo.co.uk. Meanwhile, here is an American example concerning deprived inner-city youth, based on researching Twitter.

Urban mobility and neighborhood isolation in America’s 50 largest cities

Qi Wang, Nolan Edward Phillips, Mario L. Small, and Robert J. Sampson
  1. Edited by Douglas S. Massey, Princeton University, Princeton, NJ, and approved June 6, 2018 (received for review February 10, 2018)


Living in disadvantaged neighborhoods is widely assumed to undermine life chances because residents are isolated from neighborhoods with greater resources. Yet, residential isolation may be mitigated by individuals spending much of their everyday lives outside their home neighborhoods, a possibility that has been difficult to assess on a large scale. Using new methods to analyze urban mobility in the 50 largest American cities, we find that residents of primarily black and Hispanic neighborhoods—whether poor or not—are far less exposed to either nonpoor or white middle-class neighborhoods than residents of primarily white neighborhoods. Although residents of disadvantaged neighborhoods regularly travel as far and to as many different neighborhoods as those from advantaged neighborhoods, their relative isolation and segregation persist.


Influential research on the negative effects of living in a disadvantaged neighborhood assumes that its residents are socially isolated from nonpoor or “mainstream” neighborhoods, but the extent and nature of such isolation remain in question. We develop a test of neighborhood isolation that improves on static measures derived from commonly used census reports by leveraging fine-grained dynamic data on the everyday movement of residents in America’s 50 largest cities. We analyze 650 million geocoded Twitter messages to estimate the home locations and travel patterns of almost 400,000 residents over 18 mo. We find surprisingly high consistency across neighborhoods of different race and income characteristics in the average travel distance (radius) and number of neighborhoods traveled to (spread) in the metropolitan region; however, we uncover notable differences in the composition of the neighborhoods visited. Residents of primarily black and Hispanic neighborhoods—whether poor or not—are far less exposed to either nonpoor or white middle-class neighborhoods than residents of primarily white neighborhoods. These large racial differences are notable given recent declines in segregation and the increasing diversity of American cities. We also find that white poor neighborhoods are substantially isolated from nonpoor white neighborhoods. The results suggest that even though residents of disadvantaged neighborhoods travel far and wide, their relative isolation and segregation persist.
A large and diverse literature based on longitudinal surveys, randomized control trials, and millions of administrative tax records has produced increasingly convincing evidence that growing up or living in a poor neighborhood undermines life chances (1–4). A major explanation for this effect is that residents of poor neighborhoods, especially predominantly black and Hispanic poor neighborhoods, are geographically isolated from middle-class environments of opportunity (5–8). As one influential theorist put it, residents of such neighborhoods have limited contact or sustained interactions with the individuals and institutions of “mainstream society” (5). Among other factors, social isolation in poor black neighborhoods can potentially limit young people’s access to middle-class role models, safe environments, and institutional resources, as well as adults’ access to people with information about jobs (9, 10).
Nevertheless, the neighborhood isolation explanation has relied on the implicit assumption that social interactions are limited to one’s neighborhood of residence (11, 12). In an increasingly interconnected and mobile society, this assumption is questionable (13). Indeed, although we know that few people spend all of their waking hours within their neighborhoods, we know little about how many neighborhoods they visit on an everyday basis or how far they travel. Furthermore, such dynamics may depend on the poverty or race of their own or the receiving neighborhoods in ways not yet understood. Thus, whether low-income blacks and Hispanics are, in fact, socially or geographically isolated depends on the opportunities provided by largely unknown aspects of their urban mobility (14, 15).
To date, studies relevant to this fundamental question have relied on three types of data. First, studies have examined commuting ties, which focus on adults’ travel between home and work (12, 16). However, commuting does not include neighborhoods experienced through leisure, errand activities, or visits to friends and family, all of which affect the extent of isolation. Second, several studies have used travel diaries collected by volunteers (15, 17, 18). While such methods produce rich data on the multiple locations visited by respondents, they are typically limited to one city and constrained by sample size limitations, given the onerous demands placed on study participants. These constraints are especially important given potential differences between cities. For example, travel patterns in cities with expansive public transit systems (e.g., New York City or Chicago) may differ from those in cities where driving is the primary mode of transportation (e.g., Houston or Los Angeles). These differences may also exacerbate inequalities in neighborhood isolation across race and class lines. Third, a few studies have examined the differences in mobility patterns among different social groups (19, 20), as well as their geographical interactions (21), using geolocation records from cell phones and social media platforms. However, only a few of these studies have examined race or class differences in mobility and none have done so across a large sample of cities.
Traditional studies that examined neighborhood isolation using surveys, field experiments, or tax records do not track everyday mobility for large populations with sufficient detail for statistical analyses. These data are intrinsically static, and consequently they do not capture dynamic phenomena well. Data from travel diaries as well as the burgeoning use of social media are qualitatively different, as they capture the dynamism of mobility patterns. However, these studies (particularly the former) have been limited in scope, inhibiting comparisons across cities when accounting for demographic characteristics (18). Our study builds upon prior research by analyzing large-scale social media data from the 50 most populous American cities to estimate urban travel patterns for large populations and by examining travel to all locations in the region, thereby capturing exposure patterns for all groups (22, 23). We focus particular attention on exposure to nonpoor and white neighborhoods among residents of poor and minority neighborhoods (24, 25).

Data and Methods

We use unique, publicly available Twitter data across the 50 largest population centers in the United States. Building on existing literature (26, 27), we collect more than 650 million geotagged micromessages, called tweets, from October 1, 2013, to March 31, 2015, for each of the 50 cities to conduct this study. Twitter users have the choice of opting into a function that publicly identifies the location, recorded in latitude and longitude coordinates, from where their messages are sent. These geotagged tweets create the initial dataset for our analysis. The high granularity and scale of the data provide an unprecedented level of detail in understanding how people move across space over 500 d.
Our analyses require knowing the neighborhood individuals live in, the neighborhoods they travel to, and the distances and frequencies of those journeys, none of which is directly reported by Twitter. We use density-based spatial clustering of applications with noise (DBSCAN*) to estimate the block group where individuals live due to its efficiency in handling a large dataset, accuracy in identifying clusters, and controllability of cluster sizes (28, 29). For Twitter data, this controllability is particularly important. We improve upon previous research that used clustering algorithms by implementing several additional data processing rules (SI Appendix, section 2.1). Although Twitter data use the Global Positioning System (GPS) for generating geographical information with a relatively high resolution, with the worst-case accuracy of 7.8 m with 95% confidence (30), the precision of the data can be affected by atmospheric effects, receiver quality and sky blockage, and noise caused by weather or device factors, such that two tweets sent from exactly the same location could be reported as slightly different locations. DBSCAN* helps address this issue (31, 32). Using DBSCAN*, we determine the distance between pairs of locations in a cluster and thus control which locations should be included in a cluster and which should be treated as noise. We use the DBSCAN package in R to estimate the block group of each user’s home location, a process that merges precision while restricting the ability to pinpoint any individual’s home address.
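The clustering step described above can be illustrated with a minimal sketch. This is not the authors' pipeline (they report using the DBSCAN package in R plus additional rules from SI Appendix, section 2.1); it is a plain-Python DBSCAN on coordinates assumed to be already projected to meters. The function names, the `eps` and `min_pts` values, and the "largest cluster is home" rule are all illustrative assumptions.

```python
import math
from collections import Counter


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    n = len(points)
    labels = [None] * n  # None means "not visited yet"

    def neighbors(i):
        xi, yi = points[i]
        return [j for j in range(n)
                if math.hypot(points[j][0] - xi, points[j][1] - yi) <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)  # includes i itself
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise; may later become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point becomes border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nb = neighbors(j)
            if len(nb) >= min_pts:
                queue.extend(nb)  # j is a core point, so keep expanding
    return labels


def estimate_home(points, eps=50.0, min_pts=3):
    """Assumed rule: home is the centroid of the largest cluster (None if all noise)."""
    labels = dbscan(points, eps, min_pts)
    counts = Counter(label for label in labels if label != -1)
    if not counts:
        return None
    home_label = counts.most_common(1)[0][0]
    members = [p for p, label in zip(points, labels) if label == home_label]
    cx = sum(p[0] for p in members) / len(members)
    cy = sum(p[1] for p in members) / len(members)
    return (cx, cy)
```

Here `eps` plays the role of the controllable cluster size mentioned in the text: tweets farther than `eps` meters from any dense group are treated as noise rather than pulled into the home cluster.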
Complete details on the process of identifying a Twitter user’s home location in New York City are presented in SI Appendix, section 2.1. Fig. 1 shows a representation of the underlying data for the New York City commuting zone once we estimated home residences, with boundaries of the city in red. Fig. 1A shows the density of estimated residences of Twitter users for the area around New York City, and thus coverage of individuals, which is distinct from the raw number of tweets. Our theoretical interest is in the mobility of residents of the central city, in this case New York City, and their travel within the city as well as the wider commuting zone, a larger area that captures geographies closer to metropolitan areas including the suburbs. As an example, Fig. 1B shows the travel pattern of one resident of New York City in this region; again, the city boundary is red. One can see the spread of mobility, but in this case clustered in lower Manhattan and with more visits to locations in New Jersey than the Bronx, Queens, and Staten Island combined. Given our substantive focus, we leave for future research the analysis of mobility patterns of suburban and exurban users.
Although geotagged Twitter data have been used in recent years to understand the movement of people across space (33–35), neither Twitter users in general nor those who geotag their tweets are fully representative of the local population (36). To help address this issue, we use a weighting mechanism based on the ratio of Twitter users to the true population in the block group (SI Appendix, section 2.2). We also estimate a fixed-effects regression model that takes into account that Twitter use varies by gender and age and that accounts for all unobserved differences among cities (32). Both methods leverage the large number of data points and the fact that the estimated residence block groups of the Twitter users include 34,641 of the 36,252 block groups with at least 300 residents in the 50 largest cities. In the commuting zones, geotagged tweets are in 105,160 of the 105,767 block groups (more statistics are presented in SI Appendix, section 2.2 and Fig. S2). For the regression-based results, we retrieved median ages and the percentages of males from all block groups in the 50 cities.
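As a rough illustration of the weighting idea, the sketch below assumes the simplest possible scheme: each block group is weighted by the number of residents per estimated Twitter user there. The exact mechanism is in SI Appendix, section 2.2 and is not reproduced here, so the function names and the inverse-coverage formula are assumptions.

```python
def block_group_weights(twitter_users, population):
    """Residents represented by each Twitter user, per block group (assumed scheme)."""
    weights = {}
    for block_group, n_users in twitter_users.items():
        if n_users > 0:  # block groups without users cannot be weighted
            weights[block_group] = population[block_group] / n_users
    return weights


def weighted_mean(values, weights):
    """Weighted average of a per-block-group metric, e.g. a travel radius."""
    total_weight = sum(weights[bg] for bg in values)
    return sum(values[bg] * weights[bg] for bg in values) / total_weight
```

Under this assumed scheme, a block group where Twitter users are scarce relative to its population counts for more in the aggregate, which is one way to offset uneven Twitter coverage.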
Results from unweighted, weighted, and regression-based data are highly consistent. For exposition, we report unweighted, weighted, and only the key regression-based results. All analyses reported here are the results from the neighborhoods in the entire commuting zones. We separately analyzed travel within the central city boundaries only and obtained substantively similar results (SI Appendix, Figs. S3–S6). Also, robustness assessments of the results are discussed in SI Appendix, section 2.5.

Homogeneity in Travel Radius

Our first measure of mobility is the travel radius, which is the average number of meters individuals traveled within the city’s commuting zone over the course of 18 mo. Each individual radius (rg) is calculated using the following formula:
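The formula itself is not reproduced in this copy. A reconstruction consistent with the symbol definitions that follow, assuming the standard haversine great-circle distance averaged over an individual's recorded locations, might read:

```latex
r_g = \frac{1}{n} \sum_{t=1}^{n} 2g \,\arcsin\!\left( \sqrt{ \sin^2\!\left(\frac{\phi_t - \phi_c}{2}\right) + \cos\phi_c \,\cos\phi_t \,\sin^2\!\left(\frac{\varphi_t - \varphi_c}{2}\right) } \right)
```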
where n is the total number of recorded locations for an individual, t is each visited location, ϕ is the latitude, φ is the longitude, c is the individual’s estimated home geographical coordinate, and g is the radius of the earth in meters. While home locations are only within cities’ boundaries, the visited locations may be anywhere within cities’ commuting zones. We exclude the top 1% of individuals’ travel distances to eliminate the possible bias resulting from anomalous long-distance travels. The mobility radius is the median of the weighted travel distances from the home location to all of the traveled locations not within the home cluster.
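The radius calculation can be sketched in a few lines, assuming the distance is the standard haversine great-circle distance between the home coordinate and each visited location. The Earth-radius constant, the function names, and the exact trimming rule are assumptions made for illustration.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # assumed mean Earth radius, the g in the text


def haversine_m(lat1, lon1, lat2, lon2, g=EARTH_RADIUS_M):
    """Great-circle distance in meters between two (latitude, longitude) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = math.sin(d_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    return 2 * g * math.asin(math.sqrt(a))


def travel_radius(home, visits):
    """Mean distance from the home coordinate c to each of the n visited locations t."""
    distances = [haversine_m(home[0], home[1], lat, lon) for lat, lon in visits]
    return sum(distances) / len(distances)


def drop_top_fraction(distances, fraction=0.01):
    """Discard the largest `fraction` of distances (assumed form of the 1% exclusion)."""
    ordered = sorted(distances)
    n_drop = int(len(ordered) * fraction)
    return ordered[:len(ordered) - n_drop]
```

For scale, one degree of longitude at the equator comes out at roughly 111 km with this constant, so radii of a few thousand meters correspond to coordinate differences of a few hundredths of a degree.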
There is a high degree of uniformity in travel distances at the aggregate level (Fig. 2). We mapped the distributions of the radii of mobility among residents in the 50 cities; Fig. 2 A–F show the distributions in six cities; Fig. 2G shows the aggregated distributions. The average travel radius within commuting zones is 5,292.0 m, with an SD of 1,003.6 m. When limited to the city boundaries, the average travel radius is 3,142.2 m, with an SD of 1,281.3 m. Car-dependent cities such as Los Angeles are typically argued to have different mobility profiles than those with strong public transportation, such as New York City. However, the results show high homogeneity in the distributions across the cities, while the differences are in the average distances. For example, Los Angeles and New York City differ in their average distances, as previously noted (37), but not in their aggregated city distributions. Their travel radii are 7,214.1 m and 4,642.8 m, respectively, and their distributions are both long-tailed. In fact, the radii from all individuals across the 50 commuting zones follow Burr distributions. This finding supports general theories on the regularity of urban dwellers’ mobility patterns and the evolution of a small set of basic urban principles that operate locally (38, 39).
Fig. 2.
Travel distances. (A–F) The distributions of radii for individuals’ everyday mobility patterns in the six largest cities. Median radii for New York City: 4,642.8 m; Los Angeles: 7,214.1 m; Chicago: 4,916.0 m; Houston: 6,929.7 m; Philadelphia: 3,397.7 m; and Phoenix: 6,589.9 m. (G) The distributions of the 50 largest cities in the United States. (H) Comparison of the kernel density estimations of the underlying distributions of normalized weighted and unweighted results (SI Appendix, section 2.3).

Group Differences: Radius and Spread

To compare residents of different neighborhoods, we classify block groups into poor and nonpoor based on whether the proportion of residents living under the federal poverty line was greater than 30% (a threshold of 40% produced similar results). We similarly classified block groups as majority non-Hispanic white, non-Hispanic black, or Hispanic using a threshold of 50% (a threshold of 70% produced similar results). There are too few block groups with majority Asian populations to permit reliable analyses for that group. Our first comparison examines class and race differences in the travel radius. Results are shown in Fig. 2H.
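The classification rule above can be written as a small function. The thresholds come from the text (30% for poverty, 50% for racial or ethnic majority). The function name, the dictionary keys, and the choice of strict inequalities are assumptions.

```python
def classify_block_group(poverty_rate, shares, poor_threshold=0.30, majority_threshold=0.50):
    """Return (income_class, majority_group) for one block group.

    poverty_rate: proportion of residents under the federal poverty line.
    shares: proportions keyed by 'white', 'black' (both non-Hispanic) and 'hispanic'.
    majority_group is None when no group exceeds the majority threshold.
    """
    income_class = "poor" if poverty_rate > poor_threshold else "nonpoor"
    majority_group = None
    for group in ("white", "black", "hispanic"):
        if shares.get(group, 0.0) > majority_threshold:
            majority_group = group  # at most one share can exceed 50%
    return income_class, majority_group
```

Passing `poor_threshold=0.40` or `majority_threshold=0.70` gives the alternative specifications that the text reports as producing similar results.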
The radii were normalized to facilitate comparisons since cities have different sizes of commuting zones. For each city, we divided the median travel distance by the average distance from the centroid of the city to the centroids of the farthest five block groups in the city’s commuting zone. The weighted normalized radius is ∼0.046 for residents of poor neighborhoods and roughly 0.051 for nonpoor neighborhoods. Under both specifications, radii from black neighborhoods are the highest, a finding that aligns with reports based on survey data of the size of activity spaces in Los Angeles (40). For residents of nonpoor neighborhoods, the weighted radius is 0.056 for the black neighborhoods, higher than the 0.051 for white neighborhoods. For poor neighborhoods, it is 0.049 for black neighborhoods compared with 0.045 for white neighborhoods. The true difference in meters between the average radii of poor black neighborhoods and nonpoor whites is surprisingly small; it is 235.3 m across all commuting zones. Similar travel patterns were observed if travels are limited to cities’ boundaries (SI Appendix, Fig. S3), where the average radius of poor black neighborhoods is almost the same as nonpoor white neighborhoods’ average radius.
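The normalization step can be sketched as follows, assuming centroids are given as planar coordinates in meters (for example, after a map projection). The function names are illustrative.

```python
import math


def normalization_scale(city_centroid, block_group_centroids, k=5):
    """Average distance from the city centroid to the k farthest block-group centroids."""
    distances = sorted(
        (math.dist(city_centroid, centroid) for centroid in block_group_centroids),
        reverse=True,
    )
    farthest = distances[:k]
    return sum(farthest) / len(farthest)


def normalized_radius(median_distance_m, city_centroid, block_group_centroids):
    """Median travel distance divided by the commuting zone's size scale."""
    return median_distance_m / normalization_scale(city_centroid, block_group_centroids)
```

Because the result is a ratio of two distances, values such as 0.046 or 0.051 are unitless and can be compared across commuting zones of very different sizes.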
Since isolation entails which neighborhoods individuals visit, we also examine the number of different neighborhoods that city residents traveled to within the 50 cities and the larger commuting zones. We label this value the “spread.” Although the poor generally have travel radii comparable to those of other groups, they may nonetheless travel to far fewer neighborhoods (and be more isolated as a result). We examine every geotagged message location over the 18-mo period and overlay the coordinates with the American Community Survey 5-y data on the block-group level, thereby identifying the block groups visited by each individual. For each city, we count the numbers of block groups visited by residents of poor and nonpoor white, black, and Hispanic majority groups. The results for spread within commuting zones are shown in Fig. 3.
Fig. 3.
Urban mobility spread in commuting zones. Residents from black neighborhoods visit a marginally greater number of block groups based on both weighted and unweighted results.
The numbers of neighborhoods traveled to in the commuting zones are 18.2 ± 4.47 (unweighted results) and 17.8 ± 2.47 (weighted results). The average number of days to the median number of the visited neighborhoods is 304.1. Residents of black neighborhoods, whether poor or nonpoor, tend to visit a slightly greater number of neighborhoods than all other groups. Class or economic differences overall are negligible for travel throughout the commuting zone or the city proper—people in poor neighborhoods are not especially likely to experience fewer neighborhoods than those in nonpoor neighborhoods. For visits within cities, the numbers of neighborhoods traveled to are 13.00 ± 3.99 (unweighted results) and 14.66 ± 4.21 (weighted results) across all types of neighborhoods (shown in SI Appendix, Fig. S4).
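A minimal sketch of the spread measure, assuming each geotagged message has already been matched to a block-group identifier (the overlay with American Community Survey boundaries is not shown). Excluding the home block group from the count is an assumption, as are the function names.

```python
from collections import defaultdict
from statistics import mean


def spread(visited_block_groups, home_block_group):
    """Number of distinct block groups visited, not counting the home block group."""
    return len(set(visited_block_groups) - {home_block_group})


def mean_spread_by_type(users):
    """users: iterable of (neighborhood_type, home_block_group, visited_block_groups)."""
    by_type = defaultdict(list)
    for neighborhood_type, home, visited in users:
        by_type[neighborhood_type].append(spread(visited, home))
    return {ntype: mean(values) for ntype, values in by_type.items()}
```

Repeated visits to the same block group count once, so spread measures how many different places a resident reaches rather than how often they travel.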

Group Differences: Composition

To test prominent accounts regarding isolation among residents of poor minority neighborhoods (5), we estimate the rates of exposure to white neighborhoods (pw), to nonpoor neighborhoods (pn), and to nonpoor white neighborhoods (pnw). Similar to spread, the rates of exposure to different types of neighborhoods are the proportions of each type of neighborhood visited over all neighborhoods visited (analyses using frequencies produced similar results; see SI Appendix, section 2.5). Fig. 4 displays the rates for different race and class groups relative to the commuting zones’ baseline expectations, which are the proportions of all neighborhoods in the commuting zone that encompass the three types that researchers have named “mainstream” (SI Appendix, section 2.4). We note that the literature’s use of “mainstream” does not imply normative judgment; it denotes the types that are most prevalent in US society and that provide resources (5).
Fig. 4.
Urban mobility composition adjusted by the proportions of block groups in cities' commuting zones of that demographic type. (A) The adjusted, expected proportions of individuals traveling to white neighborhoods. (B) The proportions of individuals traveling to nonpoor neighborhoods. (C) The proportions of individuals traveling to nonpoor white neighborhoods.
Fig. 4 shows clear discrepancies in residents’ exposure to mainstream neighborhoods based on the demographics of their home neighborhoods (see SI Appendix, Fig. S5 for results within city boundaries). Across Fig. 4, residents from predominantly nonpoor white (nw) neighborhoods have the highest proportion of their visited neighborhoods as white (pw), nonpoor (pn), and nonpoor white (pnw) neighborhoods. The weighted pw(nw) is 11.2% points above the baseline expectations; pn(nw) is 9.49% points and pnw(nw) is 8.95% points above them, respectively. In contrast, residents from predominantly minority neighborhoods are more socially isolated from “mainstream neighborhoods” defined by race and class, often falling significantly below the baselines. The proportions of “mainstream” neighborhoods over all visited neighborhoods by residents from poor black (pb) neighborhoods are pw(pb) −29.1%, pn(pb) −35.6%, and pnw(pb) −29.5% points, respectively. In addition, residents from Hispanic neighborhoods are also relatively isolated from nonpoor and white neighborhoods outside their home locations. Notably, residents of nonpoor black and Hispanic neighborhoods experience less exposure to either nonpoor white neighborhoods or white neighborhoods than residents of poor white neighborhoods. Moreover, their rates of exposure to nonpoor neighborhoods are only marginally higher than those of residents of poor white neighborhoods.
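The exposure rates and their deviations from the baseline, as reported above, can be sketched as follows. Neighborhood types are represented as assumed `(income_class, race)` tuples, and all names are illustrative.

```python
def exposure_rate(neighborhood_types, target):
    """Share of the given neighborhoods whose type satisfies the `target` predicate."""
    return sum(1 for ntype in neighborhood_types if target(ntype)) / len(neighborhood_types)


def deviation_from_baseline(visited_types, all_types, target):
    """Percentage-point gap between observed exposure and the commuting-zone baseline."""
    return 100 * (exposure_rate(visited_types, target) - exposure_rate(all_types, target))


def is_white(ntype):
    return ntype[1] == "white"


def is_nonpoor(ntype):
    return ntype[0] == "nonpoor"


def is_nonpoor_white(ntype):
    return ntype == ("nonpoor", "white")
```

A negative deviation means a resident's visited neighborhoods contain a smaller share of the target type than the commuting zone as a whole, which is the sense in which values such as −29.5% points indicate isolation.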
Overall, the results indicate that race contributes more to differences in exposure rates to nonpoor white neighborhoods than economic background. An examination of weighted exposure to nonpoor white neighborhoods (Fig. 4C) illustrates that the differences between white and minority neighborhoods of the same economic class range from 15.0% to 37.3% points, while the differences between poor and nonpoor neighborhoods of the same race are between 0.56% and 16.7% points. However, residents of poor white neighborhoods are still highly divergent in urban mobility patterns from nonpoor white neighborhoods—the latter are the outlier in the data and consistent with the pulling away of upper-income neighborhoods from the rest of society (41). These findings underscore the continued primacy of racial and economic segregation in the social structure of American cities, but in this case well beyond the borders of local neighborhoods (8).
Our final analysis combines all of the largest cities’ block groups to estimate a fixed-effects regression model of exposure. We only considered exposure to white nonpoor (or “middle class”) neighborhoods because it differs the most in composition from the segregated urban poor and it has been a core focus in theories of urban poverty. We use the model from our main specification to predict the exposure of residents from each of the six types of neighborhoods (race by class) to nonpoor white neighborhoods, adjusting for neighborhoods’ age and sex compositions. In this model, block groups are nested within cities. The exposure is estimated within each city as the deviation from the baseline exposure possible given the number of white nonpoor neighborhoods in the city. We estimate robust SEs clustered on cities and adjust for the median age and percentage of male residents in each neighborhood (grand mean centered). The fixed effects model controls for any unobserved city characteristics that could potentially affect the results.
Formally, for block group i in city j, the model takes the form
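The equation is not reproduced in this copy. A reconstruction consistent with the definitions that follow might read as below; the outcome symbol and the variable names are assumptions, with the outcome standing for the deviation of exposure to nonpoor white neighborhoods from the city's baseline:

```latex
y_{ij} = \beta_1\,\mathrm{NonpoorBlack}_{ij} + \beta_2\,\mathrm{NonpoorHispanic}_{ij} + \beta_3\,\mathrm{PoorWhite}_{ij} + \beta_4\,\mathrm{PoorBlack}_{ij} + \beta_5\,\mathrm{PoorHispanic}_{ij} + \beta_6\,\mathrm{Male}_{ij} + \beta_7\,\mathrm{Age}_{ij} + \alpha_j + u_{ij}
```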
where β1 to β7 reflect coefficients of the independent variables for nonpoor black, nonpoor Hispanic, poor white, poor black, and poor Hispanic neighborhoods, respectively (with white nonpoor the reference category) as well as the block groups’ proportion of male residents and median age, and where αj is the unobserved city-invariant effect and uij is the error term. We also estimated mixed effects models of the unweighted and weighted results for robustness checks, with similar results.
The race and class predictions from our model (shown in Fig. 5) are consistent with the results in Fig. 4C, after accounting for age and gender composition and unique city effects. Namely, racial differences are more important than the poor versus nonpoor distinction in the exposure of urban dwellers to nonpoor, predominantly white neighborhoods in the commuting zone. For example, the predicted probabilities that residents of poor black and poor Hispanic neighborhoods visit nonpoor white neighborhoods are 0.32 and 0.29 below the expected baseline. Conversely, it is only 0.05 below the baseline for residents from poor white neighborhoods, a difference of 0.27 and 0.24, respectively.
Fig. 5.
Predicted proportions of visits, relative to baselines in cities’ commuting zones, to nonpoor white neighborhoods by race and class of home neighborhoods, adjusted for the age and gender composition of home block groups and cities’ fixed effects.
Notably, however, the gaps are even greater between nonpoor neighborhoods. The predicted probabilities for residents from nonpoor black and nonpoor Hispanic neighborhoods visiting nonpoor white neighborhoods are 0.29 and 0.24 below the baseline. In contrast, the predicted probability for nonpoor white neighborhoods is 0.14 above the baseline, which yields differences of 0.43 and 0.38, respectively. While we find minimal differences between black and Hispanic neighborhoods by class, we find a large difference (0.19) by class for white neighborhoods. However, residents from poor white neighborhoods still have a higher predicted probability of exposure to nonpoor white neighborhoods than residents from nonpoor black and Hispanic neighborhoods. Race thus trumps class in mobility patterns compositionally despite the fact that there are minimal to no differences in distances traveled and the numbers of neighborhoods visited by race. Although there are small but significant differences between poor white and nonpoor white neighborhoods, the lack of meaningful or significant differences in radii and spread between nonpoor white neighborhoods and black neighborhoods (poor or not) holds up when we adjust for age and gender composition in similar analyses. Additionally, the results for travel across neighborhoods within a city show a similar but even stronger trend (SI Appendix, Fig. S6).
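The gaps quoted in the two paragraphs above follow directly from the predicted deviations. The short check below recomputes them from the values given in the text; the dictionary layout and function name are illustrative.

```python
# Predicted deviations from the baseline, as quoted in the text
predicted = {
    ("poor", "black"): -0.32,
    ("poor", "hispanic"): -0.29,
    ("poor", "white"): -0.05,
    ("nonpoor", "black"): -0.29,
    ("nonpoor", "hispanic"): -0.24,
    ("nonpoor", "white"): 0.14,
}


def gap(group_a, group_b):
    """Difference in predicted exposure between two home-neighborhood types."""
    return round(predicted[group_a] - predicted[group_b], 2)


# Race gaps within the same class
poor_gaps = (gap(("poor", "white"), ("poor", "black")),
             gap(("poor", "white"), ("poor", "hispanic")))
nonpoor_gaps = (gap(("nonpoor", "white"), ("nonpoor", "black")),
                gap(("nonpoor", "white"), ("nonpoor", "hispanic")))

# Class gap within white neighborhoods
white_class_gap = gap(("nonpoor", "white"), ("poor", "white"))
```

The same table shows the ordering the text emphasizes: the poor white value (−0.05) sits above both the nonpoor black (−0.29) and nonpoor Hispanic (−0.24) values.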


Our study does not make claims about individuals’ travel patterns based on their particular race or class. While we use data on individuals’ tweets, our findings pertain to data at the block group (or neighborhood) level, a focal interest in research on concentrated poverty. In addition, although our results are based on almost 400,000 users who posted geocoded tweets, sample representativeness is a potential limitation. Our approach to this challenge was to compare weighted and unweighted results, and to adjust key results by age and sex composition. These approaches have complementary strengths and weaknesses and yield strongly consistent results. At a minimum, our results are valid for the large population of Twitter users that enable geotagging. Prior research suggests that these individuals are generally younger and more affluent and that this demographic has larger activity spaces (40, 42). We therefore would expect that differences in mobility patterns would be even greater for the general population.
Another limitation is that individuals may have different tweeting habits based on where they travel (43). While an option might have been to attempt to predict from which of their locations people are most likely to tweet, researchers have not yet developed methods, based on either multiple combined data sources or natural language processing, to accurately predict such locations (44), a limitation currently faced by users of not only Twitter data but also other large-scale data resources, such as cell phone records or GPS. Consequently, the potential heterogeneity in tweeting habits cannot be taken into consideration in a way that fully addresses the representativeness of locations. Finally, our study assesses the potential for contact (i.e., physical copresence) through exposure rather than an observed interaction; note, however, that the former is unequivocally the prerequisite for the latter. These and other issues should be addressed in future research with alternative data sources that provide a ground truth of the distribution of representative locations and durations of exposure to social environments (SI Appendix, section 3).
Our analyses suggest several important conclusions. Residents of poor minority neighborhoods do not limit their lives to those neighborhoods. In addition, they appear to travel about as widely across their cities and to as many neighborhoods as those of other groups. Nevertheless, they seem much less exposed to middle-class or white neighborhoods than those living in middle-class neighborhoods, supporting the “mainstream” underexposure hypotheses, perhaps in ways more far-reaching than initially intended (5). It is notable that residents of minority neighborhoods, regardless of class, are less exposed to nonpoor or white neighborhoods than even those of poor white neighborhoods. The finding aligns with other smaller-scale studies based on surveys and GPS data (14, 45, 46), suggesting that heterogeneity across race mediates the effects of neighborhood poverty (7, 8, 47). We also find that residents of poor white neighborhoods are less exposed to mainstream areas than those in nonpoor white ones. Importantly, these results hold even after accounting for differences in cities’ demographics, indicating broader trends across the 50 most populous cities. An exposition of these trends would not have been possible without large-scale, dynamic data.
Although racial segregation and racial income inequality in the United States may have decreased (48), we find race still matters more than poverty for relative exposure to middle-class neighborhoods. These findings, among a population that by definition is technologically connected, imply that racial segregation is operating at a higher-order level than typically recognized: Racial segregation is manifest not only where people live but also where they travel throughout a city and whom they are exposed to (49, 50). Our research thus provides evidence that although the United States is becoming increasingly diverse, the interactions across race and class groups that ultimately contribute to societal integration are not taking place (22). Racial segregation reaches well beyond one’s home, indicating the importance of considering mobility interactions across neighborhoods.


This work was supported by MacArthur Foundation Grant 14-107644-000-USP, National Science Foundation Grant SES-1637136, and the Project on Race, Class and Cumulative Adversity at Harvard University funded by the Ford Foundation and the Hutchins Family Foundation. The paper was also supported by the Boston Area Research Initiative. All human mobility data used in the study are publicly available from Twitter. We use geolocation information embedded in tweets that was intentionally revealed by the Twitter users. To protect the confidentiality of any given user’s movement trajectory, all users’ information was encrypted, and all data are reported in nonidentifiable form. To comply with Twitter’s policy, no collected information has or will be distributed except in nonidentifiable form.


  • Author contributions: Q.W., N.E.P., M.L.S., and R.J.S. designed research; Q.W., N.E.P., M.L.S., and R.J.S. performed research; Q.W., N.E.P., and R.J.S. analyzed data; and Q.W., N.E.P., M.L.S., and R.J.S. wrote the paper.
  • The authors declare no conflict of interest.
  • This article is a PNAS Direct Submission.
  • This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1802537115/-/DCSupplemental.

