



This discussion paper examines the advantages and disadvantages of using longitudinal and cross-sectional surveys to study changes in travel behavior over time. It focuses on the statistical power, standard errors, and inferences available from each type of data, and touches on the use of Generalized Estimating Equations (GEE) and Generalized Linear Models (GLM) in analyzing travel data and on the importance of handling attrition in longitudinal studies.
Project Battelle 94-16, FHWA, HPM-40 06/12/ Received June 20, 1996
A Discussion Paper
Julie L. Yee and Debbie Niemeier
Doctoral Candidate, Division of Statistics, University of California, Davis
Asst. Professor, Dept. of Civil and Environmental Engineering, University of California, Davis, 95616
The collection of panel data entails following a cohort of individuals in order to monitor changes over a period of time. For the Puget Sound Transportation Panel (PSTP), households were asked to provide a two-day travel diary for each survey time period. The survey time period is referred to as a “wave,” and a total of five waves of data have now been collected at approximately one-year intervals. Ideally, in any longitudinal study, the same group of subjects (individuals) is followed during each wave, making it possible to observe any one individual’s travel behavior over time. Longitudinal surveys differ greatly from the collection of repeated cross-sectional data, in which an independent sample is collected at each wave to represent the population for that time period. With cross-sectional data, the observed trip information is representative of the population at a single point in time, and the temporal aspects of a specific individual’s travel are not necessarily available. In the following discussion we focus on several pertinent issues regarding the scope and limits of statistical inference for the two types of data that result from longitudinal and cross-sectional surveys.
The aim in comparison studies is not only to illustrate the differences between populations, but also to establish some measure of significance for the observed difference. The population of interest in the PSTP is the residents of the Puget Sound area, and the statistical questions concern the dynamics of their individual travel behavior at different time points during a five-year interval. Either of the above methods, longitudinal or cross-sectional surveys, may be used to gather data for computational comparisons of travel behavior differences among representative samples of the Puget Sound population.
The statistics typically used to compare travel differences are means and proportions, which are reflections of the mean differences among the entire population. Since the result of any sampling procedure is subject to variation, the amount of stock that can be placed in any estimate (e.g., a mean) is controlled by the estimate’s standard error. In repeated cross-sectional analyses, standard errors are large whenever large variations between individuals (not necessarily the same person in each wave) exist, and the power to detect statistically significant differences in the estimates can be undermined. Alternatively, in longitudinal analyses, by identifying those observations that are measured on the same individuals, it is possible to focus on changes occurring within subjects and to make population inferences that are not as sensitive to between-subject variation. A simple illustration helps to elaborate on this point:
Example
There is no significant increase in the mean trip length over the two waves.
In the first example, it is not clear that a widespread increase in average trip lengths is occurring based on inspection of the data. This is supported by an insignificant t-statistic. In the second example, where each observed subject’s trip length increased uniformly by 1, there is greater reason to believe that this is occurring on a widespread basis; this is supported by a very large t-statistic.
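The contrast between the two analyses can be sketched numerically. The following is a minimal illustration and does not reproduce the paper's original example data: the trip lengths are hypothetical values chosen so that every subject's trip length rises by roughly 1 between waves. It computes a two-sample (Welch) t-statistic, which treats the two waves as independent cross-sections, and a paired t-statistic, which uses the within-subject differences.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical trip lengths for five subjects (assumed values, not PSTP data).
wave1 = [2.0, 5.0, 9.0, 14.0, 20.0]
wave2 = [2.9, 6.1, 10.0, 15.2, 20.8]  # each subject increases by about 1

n = len(wave1)
mean_increase = mean(wave2) - mean(wave1)

# Repeated cross-section view: the waves are treated as independent samples,
# so the large between-subject variation enters the standard error.
se_indep = sqrt(variance(wave1) / n + variance(wave2) / n)
t_indep = mean_increase / se_indep

# Longitudinal view: each subject is compared with itself, so only the
# variation of the within-subject differences enters the standard error.
diffs = [after - before for before, after in zip(wave1, wave2)]
se_paired = sqrt(variance(diffs) / n)
t_paired = mean(diffs) / se_paired

print(f"mean increase:         {mean_increase:.2f}")
print(f"independent-samples t: {t_indep:.2f}")
print(f"paired t:              {t_paired:.2f}")
```

With these assumed values the two approaches estimate the same mean increase, while the paired statistic is far larger because its standard error excludes the between-subject spread.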
The estimated mean increase is the same for both methods, but the standard error plays a key role in determining statistical significance. The Generalized Estimating Equations (GEE) analysis used by Yee and Niemeier [1] is based on this concept. The analysis adjusts the Generalized Linear Model (GLM), which fits a model of completely
The drawbacks of using longitudinal data have largely to do with coverage problems. Coverage refers to “the set of units constituting the target population” [2] and includes issues associated with both selecting and tracking individual sample respondents. The prominent coverage limitations are:
After the first wave’s recruitment, the study is restricted to the members of that sample although changes in the population may occur.
Despite attempts to locate households from wave to wave, there is invariably a fair amount of attrition.
The design of longitudinal data is particularly well suited to stationary populations. In regionwide transportation studies, this limits the inference to subjects residing long-term in a closed region. However, most regionwide populations, like that of Puget Sound, are not closed systems. By keeping the sample population fixed, there is a risk of drawing inaccurate conclusions about the true population, which may have changed as a result of an influx or outflux of residents with different behavioral characteristics than the indigenous population.
This raises the question of what conclusions can be drawn from longitudinal data, and this question is, in turn, related to understanding the primary goals of the PSTP study. For example, is the primary purpose of the PSTP to monitor changes in travel activity in the Puget Sound area as an aggregate of effects ranging from demographic changes to group dynamics? Or is it to monitor changes in travel activity due to individual attributes? The former goal, measuring the aggregate effects by demographic group, can be addressed with independent repeated cross-sectional sampling; each wave’s sample of subjects is allowed to vary with time and with any changing population dynamics. The latter question is probably better addressed with longitudinal data, since this data collection method provides more detailed insight into behavior at the person level.
Solon notes that marginal probabilities may be estimated using either longitudinal or repeated cross-sectional data, but certain conditional probabilities may be estimated only with longitudinal data [3]. For example, it is possible to use either longitudinal or repeated cross-sectional data to study whether there was an increase in SOV commute activity between wave 1 and wave 2; but to assess whether a person who uses HOV-transit in wave 1 is more likely to use SOV or HOV in wave 2 requires the use of longitudinal data. For instance, the PSTP reveals that HOV-transit commuters transitioned to HOV-pool commuting more often than to SOV commuting [4]. Likewise, SOV commuters were more likely to become HOV-pool commuters than HOV-transit commuters in later waves.
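The distinction between marginal and conditional probabilities can be made concrete with a small sketch. The records below are hypothetical (they are not PSTP counts), and the helper names `shares` and `transition_probs` are introduced only for this illustration. The marginal mode shares in each wave could be computed from two independent cross-sections, whereas the transition probabilities require knowing each person's mode in both waves.

```python
from collections import Counter

# Hypothetical panel records: (mode in wave 1, mode in wave 2) per commuter.
panel = (
    [("SOV", "SOV")] * 6
    + [("SOV", "HOV-pool")] * 2
    + [("HOV-transit", "HOV-pool")] * 2
    + [("HOV-transit", "SOV")] * 1
    + [("HOV-transit", "HOV-transit")] * 1
)

def shares(modes):
    """Return the proportion of observations falling in each mode."""
    counts = Counter(modes)
    total = sum(counts.values())
    return {mode: count / total for mode, count in counts.items()}

# Marginal shares: each wave is summarized on its own, so the link between
# a person's wave-1 and wave-2 modes is never used.
marginal_w1 = shares(m1 for m1, _ in panel)
marginal_w2 = shares(m2 for _, m2 in panel)

def transition_probs(records, origin):
    """Wave-2 mode shares among people who used `origin` in wave 1."""
    return shares(m2 for m1, m2 in records if m1 == origin)

# Conditional probabilities: only available because records are linked.
from_transit = transition_probs(panel, "HOV-transit")

print("wave 1 shares:", marginal_w1)
print("wave 2 shares:", marginal_w2)
print("wave 2 mode given HOV-transit in wave 1:", from_transit)
```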
The optimal survey design would combine the positive attributes of both sampling methods. For example, longitudinal sampling with rotation allows the entry of new subjects, which helps to capture any dynamic changes in population composition due to in-migration, while still retaining a portion of the earlier sample groups to represent any person-level changes among the stable population.
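One way to picture sampling with rotation is as a schedule of overlapping cohorts. The sketch below is a simplified assumption rather than the PSTP design: the function name `rotating_panel`, the equal-sized cohorts, and the rule of retiring exactly one cohort per wave are all hypothetical choices made for illustration.

```python
def rotating_panel(n_waves, n_groups):
    """Return, for each wave, the labels of the cohorts in the sample.

    Assumed scheme: the sample always holds n_groups cohorts. At every new
    wave the oldest cohort is retired and one newly recruited cohort enters.
    """
    schedule = []
    for wave in range(1, n_waves + 1):
        schedule.append(list(range(wave, wave + n_groups)))
    return schedule

schedule = rotating_panel(n_waves=5, n_groups=3)
for wave, cohorts in enumerate(schedule, start=1):
    print(f"wave {wave}: cohorts {cohorts}")

# Cohorts shared by consecutive waves carry the person-level information;
# the newly entering cohort reflects the current population.
overlaps = [
    sorted(set(earlier) & set(later))
    for earlier, later in zip(schedule, schedule[1:])
]
```

Under this assumed scheme, any two consecutive waves share all but one cohort, so within-subject comparisons remain possible while the sample composition is gradually renewed.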
Each method, whether the standard longitudinal or the rotating survey, still faces possible attrition problems. Attrition embodies the combined sampling problems due to out-migration and non-migratory dropouts of survey respondents and is a potentially serious source of bias. The analysis of incomplete data has been widely studied, and methods are continually being developed to handle data when some observations are missing at random. However, attrition is known to occur more with some groups of people than with others. In particular, subjects with lower income, lower ages, and lower trip frequencies have a greater tendency to drop out, and this is likely to skew statistical results [5].
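The effect of non-random attrition can be illustrated with simple arithmetic. The figures below are hypothetical and deliberately exaggerated: subjects with low trip frequencies are assumed to be the ones who drop out, so the wave-1 mean computed from those who remain overstates the mean of the full original sample.

```python
from statistics import mean

# Hypothetical wave-1 sample: (daily trips, still in the panel at wave 2?).
# Dropouts are concentrated among low-frequency travelers (an assumption).
subjects = [
    (1, False), (2, False), (2, True), (3, False),
    (4, True), (5, True), (6, True), (7, True),
]

full_mean = mean(trips for trips, _ in subjects)
stayer_mean = mean(trips for trips, stayed in subjects if stayed)
bias = stayer_mean - full_mean

print(f"wave-1 mean, full sample:  {full_mean:.2f}")
print(f"wave-1 mean, stayers only: {stayer_mean:.2f}")
print(f"difference due to attrition: {bias:.2f}")
```

If dropouts were instead missing at random, the two means would differ only by sampling noise; the systematic gap here comes entirely from who leaves the panel.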
Independent repeated cross-sectional sampling can be advantageous in this regard since complete sets of new respondents are continually selected, thus ensuring a steady level of reliability for each successive sample when under stable sampling conditions. Alternatively, the practice of “refreshing” the sample of a longitudinal study with new recruits who fit the profiles of those who drop out has much of the appeal of weighted and stratified repeated cross-sectional sampling without compromising the advantages of repeatedly sampling from participants who do not drop out. This is discussed in more detail in the next section.
Weights serve to counter the biases that may result from disproportionately sized sampling strata. The PSTP was collected by a stratified sampling protocol based on mode use proportions derived from previous research. Respondents were recruited using three methods: random telephone digit dialing, contacting prior participants in the Seattle Metro transit surveys, and solicitation of volunteers on randomly selected bus routes. The random telephone digit dialing method was the primary way of recruiting participants who drive alone or carpool. Since transit users comprise a very small portion of the population, the latter two methods were used to supplement the random telephone digit dialing in soliciting transit users into the survey, and sampling weights were developed to help control for the disproportionate sizes of the mode groups.
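A minimal sketch of how such weights operate is given below. It is not the PSTP weighting scheme: the population shares, sample counts, and stratum means are hypothetical, and the weight for each mode group is simply assumed to be its population share divided by its sample share.

```python
# Hypothetical inputs (assumed values, not PSTP figures).
population_share = {"SOV": 0.75, "HOV-pool": 0.20, "HOV-transit": 0.05}
sample_count = {"SOV": 50, "HOV-pool": 25, "HOV-transit": 25}
mean_trips = {"SOV": 4.0, "HOV-pool": 3.0, "HOV-transit": 2.0}

n = sum(sample_count.values())

# Weight per respondent in a stratum: population share / sample share.
# Oversampled groups (here transit users) receive weights below 1.
weights = {
    mode: population_share[mode] / (sample_count[mode] / n)
    for mode in sample_count
}

# Unweighted mean: every respondent counts equally, so the oversampled
# transit group pulls the estimate toward its own stratum mean.
unweighted = sum(sample_count[m] * mean_trips[m] for m in sample_count) / n

# Weighted mean: respondents are rescaled to the population proportions.
weighted = (
    sum(weights[m] * sample_count[m] * mean_trips[m] for m in sample_count)
    / sum(weights[m] * sample_count[m] for m in sample_count)
)

print("weights:", weights)
print(f"unweighted mean trips: {unweighted:.2f}")
print(f"weighted mean trips:   {weighted:.2f}")
```

By construction, the weighted mode shares equal the prescribed population shares regardless of how the sample itself is composed.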
Using weights, however, binds sample mode proportions to prescribed values. Under an independent repeated cross-sectional sampling strategy, this weighted and stratified sampling protocol could fail to reveal changes in mode choice behavior over time. This is less of an impediment in longitudinal studies where it is the sample that remains fixed and mode choices are observed to vary. Unfortunately, attrition in
1. Yee, J.L. and Niemeier, D.A., “Travel Trends Using the Puget Sound Panel Survey: A Generalized Estimating Equations Approach.”
2. Colledge, M., “Coverage and Classification Maintenance Issues in Economic Surveys,” Panel Surveys, Wiley, 1989.
3. Solon, G., “The Value of Panel Data in Economic Research,” Panel Surveys, Wiley, 1989.
4. Yee, J.L. and Niemeier, D.A., “Task E - Volume 2: Specialized Travel Trends in the Puget Sound Panel Data 1989-1993.”
5. Murakami, E. and Watterson, W.T., “Attrition and Replacement Issues in the Puget Sound Transportation Panel,” 70th Annual Meeting of the Transportation Research Board, 1991.