Filtering and clustering GPS time series for lifespace analysis




Morrison, Laura May




This thesis focuses on various aspects of community mobility and lifespace. Mobility is of particular interest to those working with the elderly population or with patients affected by neurological diseases, such as Alzheimer's and Parkinson's diseases. One aspect of mobility is the number of “hotspots” in a person's daily (or weekly) trajectory, which represent the locations at which an individual remains for a minimum predetermined length of time. A single identified hotspot suggests potentially limited mobility, whereas multiple identified hotspots indicate a more mobile individual. Hotspots can be identified by applying cluster analysis to GPS time series. However, existing clustering algorithms such as k-means and trimmed k-means do not take into account the time dependencies between the location points in the series, and they require the number of clusters to be known ahead of time. As a result, the clusters they produce do not represent the subjects' activity centres well. In this thesis we develop a robust time-dependent clustering criterion that performs very well in identifying these clusters. Another aspect of mobility is the total distance travelled. The total distance computed from the original GPS data is inflated by noise in the data. Because noise in GPS time series has particular characteristics, we have investigated both the identification of noisy segments of data and smoothing techniques. The average amplitude of acceleration is proposed as an appropriate method to identify the large noise that occurs in GPS data. A multi-level trimmed means smoother is proposed as an appropriate method to filter the identified large noise. Three methods were investigated to determine an ellipse that identifies the spatial area an individual purposely moves through in daily life. The classical and robust 95% ellipses contain 95% of the points, but do not necessarily capture the distinct shape of the data.
The minimum spanning ellipse, computed over the series with all points in each identified cluster reduced to that cluster's central value, captures the shape of the data very well and is proposed as the most appropriate lifespace ellipse. For the subjects available in the mobility study, results are presented for the total distance travelled and a meaningful lower bound, the number of hotspots, the proportion of time spent in the hotspots, and the areas of the classical 95% ellipse, the robust 95% ellipse and the minimum spanning ellipse. Other problems addressed in processing the data include obtaining appropriate estimates for the missing values and converting the time series from degrees of longitude and latitude to metres in the Cartesian (x,y) plane.
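
As an illustration of the last processing step and of the total-distance computation, the following is a minimal sketch and not the method used in the thesis, which the abstract does not specify. It assumes an equirectangular approximation about a reference point and a mean Earth radius of 6,371,000 m; the function names to_local_xy and total_distance are hypothetical.

```python
import math

# Assumption: mean Earth radius in metres (not taken from the thesis).
EARTH_RADIUS_M = 6_371_000.0


def to_local_xy(lat_deg, lon_deg, lat0_deg, lon0_deg):
    """Convert (lat, lon) in degrees to (x, y) in metres relative to a
    reference point, using an equirectangular approximation. This is
    reasonable only over small areas, such as one person's lifespace."""
    lat0 = math.radians(lat0_deg)
    # East-west distance shrinks with the cosine of the reference latitude.
    x = EARTH_RADIUS_M * math.radians(lon_deg - lon0_deg) * math.cos(lat0)
    # North-south distance is proportional to the latitude difference.
    y = EARTH_RADIUS_M * math.radians(lat_deg - lat0_deg)
    return x, y


def total_distance(points_xy):
    """Sum of straight-line distances between consecutive (x, y) points.
    On unfiltered GPS data this sum also accumulates noise, which is why
    the raw total distance is inflated."""
    return sum(math.dist(p, q) for p, q in zip(points_xy, points_xy[1:]))
```

Under these assumptions, a call such as total_distance([to_local_xy(lat, lon, lat0, lon0) for lat, lon in fixes]) gives the raw path length in metres for a hypothetical list of (lat, lon) pairs named fixes. Applying the same function to a smoothed series would give a correspondingly smaller value.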



Clustering, Filtering, Lifespace, Time series