Exploratory Data Analysis

The Pet Cats UK data set tracks the movements of 101 pet cats across the UK. It includes information such as the cat’s name, sex, location at time of tracking (with time stamps), number of prey caught at a certain time and health information (e.g. food intake, reproductive condition). In addition to the existing variables, we used the location and time variables to calculate distance moved, change in height, and speed.
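As a rough illustration of how distance, height change, and speed can be derived from consecutive location fixes, here is a minimal Python sketch. It assumes each fix is a (timestamp, latitude, longitude, height) tuple and uses the haversine great-circle formula; the function names, the tuple layout, and the choice of formula are assumptions made for illustration, not necessarily what was used to build the variables in this report.

```python
import math
from datetime import datetime

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def step_metrics(fixes):
    """Per-step distance (km), height change (m) and speed (km/h).

    `fixes` is a time-ordered list of (timestamp, lat, lon, height_m)
    tuples for one cat (a hypothetical layout).
    """
    steps = []
    for (t0, lat0, lon0, h0), (t1, lat1, lon1, h1) in zip(fixes, fixes[1:]):
        dist = haversine_km(lat0, lon0, lat1, lon1)
        hours = (t1 - t0).total_seconds() / 3600
        # Guard against duplicate timestamps, which would divide by zero.
        speed = dist / hours if hours > 0 else float("nan")
        steps.append({
            "distance_km": dist,
            "height_change_m": h1 - h0,
            "speed_kmh": speed,
        })
    return steps


# Made-up fixes, 30 minutes apart, purely for demonstration.
example = [
    (datetime(2017, 6, 1, 12, 0), 50.15, -5.07, 20.0),
    (datetime(2017, 6, 1, 12, 30), 50.16, -5.07, 25.0),
]
print(step_metrics(example))
```

Summing `distance_km` per cat would give a total distance moved; GPS noise between fixes inflates such totals, so any real pipeline would likely need some filtering first.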

Among the cats in our data set, 46% are female, and the average age is 5.61 (+/- 3.49) years. We were interested in whether the cats' characteristics and activities differ by age and sex. We begin by comparing the distribution of age, with sex shown as the fill color.

The above graph shows that the age distribution is right-skewed, with a median of 5 years. This matters because age is a potential factor affecting the aggregate level of movement of the cats.

In addition, this is inherently a spatial data set, so we should also look at the spatial distribution of the observations. To do so, we look at a spatial heatmap of the cat observations.

The above graph shows every data point in the data set, which means that there are multiple data points per cat, one for each time the tracker recorded its location. We can see that the data include cats from the South West Peninsula of Britain. The main clusters lie primarily in cities, which makes sense since these are the areas where the human population (and therefore the cat-owning population) is high.

With these facts about the data in mind, we can move into the research questions.

Research Questions

Now that we have a decent idea of what the data set looks like, there are three questions we want to ask:

  1. Are there demographic differences in how cats move and hunt?
  2. Are cats more active at certain times of the day?
  3. Does cat activity really affect fitness/physical capabilities?

Question 1: Differences in cat movement and hunting capability.

In this section, we ask whether cats' movement and hunting behaviour differ by demographic characteristics. To do this, we first compared the number of prey caught by male and female cats of different ages.

We see that among younger cats (<5 yrs old), males caught more prey. For middle-aged cats (5-15 yrs), the trend reverses, with females catching more prey. For the oldest cats (>15 yrs), males again catch more prey.

Looking at movement, we examined the relationship between age and distance moved.

## 
## Call:
## lm(formula = distance ~ age + animal_sex, data = cats.grouped)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -21.523 -10.865  -5.384   2.727 170.859 
## 
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)    
## (Intercept)     26.7704     6.4572   4.146 7.25e-05 ***
## age             -1.5822     0.8015  -1.974   0.0512 .  
## animal_sexMale  -4.7680     5.4386  -0.877   0.3828    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 26.11 on 97 degrees of freedom
##   (1 observation deleted due to missingness)
## Multiple R-squared:  0.04005,    Adjusted R-squared:  0.02026 
## F-statistic: 2.023 on 2 and 97 DF,  p-value: 0.1378

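The graph shows that distance moved decreases as cats get older, and that this decline does not differ noticeably between the sexes. The regression output above is broadly consistent with this. For every one-year increase in age, cats move 1.58 km per month less on average, but the effect falls just short of significance at the 5% level (t = -1.97, p = .051). On average, males move 4.77 km per month less than females, but the standard error is large, so the difference is not significant (t = -0.88, p = .38). Note that the model is additive (age + animal_sex), so it tests a sex difference in average distance rather than in the rate of decline with age. The model as a whole explains little of the variation in distance (R-squared = 0.04; F = 2.02, p = .14).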
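To make the regression table easier to read, the short Python sketch below re-derives three quantities from the numbers printed in the output: the t value for age (estimate divided by standard error), an approximate 95% confidence interval for the age coefficient, and the overall F-statistic from R-squared. The critical value 1.985 is an assumed, rounded t quantile for 97 degrees of freedom; everything else is copied from the table.

```python
# Values copied from the regression output above.
estimate_age = -1.5822
se_age = 0.8015
r_squared = 0.04005
n_predictors = 2   # age and animal_sex
resid_df = 97      # residual degrees of freedom

# t value: coefficient estimate divided by its standard error.
t_age = estimate_age / se_age

# Approximate 95% confidence interval for the age coefficient.
# T_CRIT_97 is an assumed rounded critical value of the t distribution
# with 97 degrees of freedom (two-sided, 5% level).
T_CRIT_97 = 1.985
ci_low = estimate_age - T_CRIT_97 * se_age
ci_high = estimate_age + T_CRIT_97 * se_age

# Overall F-statistic recovered from R-squared:
# F = (R^2 / k) / ((1 - R^2) / residual df)
f_stat = (r_squared / n_predictors) / ((1 - r_squared) / resid_df)

print("t value for age:", round(t_age, 3))
print("approx. 95% CI for age:", round(ci_low, 2), "to", round(ci_high, 2))
print("overall F-statistic:", round(f_stat, 3))
```

The recomputed t value and F-statistic agree with the printed table, and the approximate interval for the age coefficient only just includes zero, which is another way of seeing why p = 0.0512 sits right at the edge of the 5% threshold.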

Question 2: When are cats lazy?

Our second topic of interest is whether cats are more active at certain times of day and whether this differs by sex. To assess this question, consider the line graph below.

We do see some trends in the average distance traveled in each hour, with peaks at around midnight and 2:00pm. Male and female cats follow similar trends, but the peak at midnight is dominated by female cats. To examine this in a more rigorous statistical manner, we can run an ANOVA test.

##                       Df   Sum Sq Mean Sq F value Pr(>F)
## as.factor(df$hour)    23 2.06e+07  895709   0.678  0.872
## Residuals          18090 2.39e+10 1321196               
## 101 observations deleted due to missingness

As the p-value is 0.872, we cannot reject the null hypothesis, so we cannot conclude that there is any statistically significant difference in distance traveled between hours. This comparison is between individual hours, however, and not broader times of day. If we instead group the hours into bins of morning, afternoon, and night, we can look at the same data in a new light.

In the binned plot, we see some qualitative differences: cats tend to be least active in the mornings and most active in the evenings. The ANOVA test below, however, again fails to reject the null hypothesis (p = 0.451), so we cannot state that average distance traveled differs significantly by time of day.
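The binning step can be sketched as follows in Python. The report does not state which hours fall into each bin, so the cut points used here (morning 05:00-11:59, afternoon 12:00-17:59, night otherwise) and the function names are assumptions chosen only for illustration.

```python
from collections import defaultdict


def time_of_day(hour):
    """Map an hour (0-23) to a coarse bin; the cut points are assumed."""
    if not 0 <= hour <= 23:
        raise ValueError(f"hour must be in 0..23, got {hour}")
    if 5 <= hour < 12:
        return "morning"
    if 12 <= hour < 18:
        return "afternoon"
    return "night"


def mean_distance_by_bin(records):
    """Average distance per time-of-day bin.

    `records` is an iterable of (hour, distance) pairs (hypothetical layout).
    """
    totals = defaultdict(lambda: [0.0, 0])  # bin -> [sum, count]
    for hour, distance in records:
        bucket = totals[time_of_day(hour)]
        bucket[0] += distance
        bucket[1] += 1
    return {name: total / count for name, (total, count) in totals.items()}


# Made-up values, purely for demonstration.
print(mean_distance_by_bin([(6, 10.0), (7, 20.0), (14, 5.0), (23, 1.0)]))
```

Different cut points can change which bin looks most active, so the choice of boundaries is worth reporting alongside any result based on these bins.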

##                     Df    Sum Sq Mean Sq F value Pr(>F)
## timedf$timeofday     2 2.103e+06 1051475   0.796  0.451
## Residuals        18111 2.392e+10 1320686               
## 101 observations deleted due to missingness
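For readers less familiar with how the F value in an ANOVA table arises, the sketch below computes a one-way ANOVA F-statistic by hand in Python and then checks that the F values printed in the two tables above equal the ratio of their mean squares. The function name and the toy groups are made up for illustration; the mean squares are copied from the output.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of groups of numbers."""
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]

    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variation of observations around their own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


# Toy data, purely for demonstration.
print(one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]]))

# Mean squares copied from the two ANOVA tables above.
f_hourly = 895709 / 1321196      # by hour
f_binned = 1051475 / 1320686     # by time of day
print(round(f_hourly, 3), round(f_binned, 3))
```

An F value below 1, as in both tables, means the variation between group means is smaller than the average variation within groups, which is why neither test comes close to significance.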

Question 3: Does cat fitness make a difference?

First, to get a geographical sense of where cats move, consider the graph below.

We see that there is a wide gap between the movement of some cats and others. Some travel vast distances between tracked points, while others tend to stay within very local regions. Two of the cats appear to go into the ocean, which we attribute to errors in data collection or processing by the creators of the data set. Most cats travel locally; the few that cover vast distances may reflect owners taking their cats on trips to neighboring cities.

We then investigate whether cats who prefer spending time indoors or outdoors differ in how fast they are.

## 
## Call:
## lm(formula = speed ~ hrs_outdoors * age, data = cats_individual2)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -1011.0  -434.7   -75.5   326.6  2983.0 
## 
## Coefficients:
##                   Estimate Std. Error t value Pr(>|t|)    
## (Intercept)      2373.5752   290.9481   8.158 1.31e-12 ***
## hrs_outdoors      -25.0378    21.2954  -1.176    0.243    
## age                 4.1545    40.2522   0.103    0.918    
## hrs_outdoors:age   -0.8447     3.1252  -0.270    0.788    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 605.5 on 96 degrees of freedom
##   (1 observation deleted due to missingness)
## Multiple R-squared:  0.06314,    Adjusted R-squared:  0.03387 
## F-statistic: 2.157 on 3 and 96 DF,  p-value: 0.09816

From the plot above, we see that some of the fastest cats are actually the ones who spend almost all of their time indoors, as reflected by the bluest dots positioned at the upper end of the y-axis. The relationship between ground speed and hours spent outdoors is not significant (t = -1.18, p = .24), which suggests that indoor-loving cats can be as fast as outdoor-loving ones. Slightly more of the younger cats seem to prefer being outdoors, but overall there isn’t a clear association between age and indoor-outdoor preference.
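Because the model includes an interaction term (hrs_outdoors * age), the estimated effect of hours outdoors on speed depends on age. The Python sketch below shows how the fitted coefficients combine; the coefficients are copied from the output above, while the function names and example inputs are made up. None of the slope terms is statistically significant, so this only illustrates how to read the table and is not evidence of an effect.

```python
# Coefficients copied from the regression output above.
COEF = {
    "intercept": 2373.5752,
    "hrs_outdoors": -25.0378,
    "age": 4.1545,
    "hrs_outdoors:age": -0.8447,
}


def predicted_speed(hrs_outdoors, age):
    """Fitted speed for a cat with the given hours outdoors and age."""
    return (COEF["intercept"]
            + COEF["hrs_outdoors"] * hrs_outdoors
            + COEF["age"] * age
            + COEF["hrs_outdoors:age"] * hrs_outdoors * age)


def outdoors_slope(age):
    """Estimated change in speed per extra hour outdoors at a given age."""
    return COEF["hrs_outdoors"] + COEF["hrs_outdoors:age"] * age


# Example inputs are arbitrary and only illustrate the arithmetic.
for age in (1, 5, 10):
    print(age, round(outdoors_slope(age), 2), round(predicted_speed(10, age), 1))
```

The slope for hours outdoors becomes slightly more negative with age in the point estimates, but with standard errors this large the data are equally compatible with no relationship at all.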

Conclusion

In this analysis, we explored how and when cats in the UK move and hunt, and how individual characteristics such as age and sex relate to these behaviours. We found a tendency for older cats to move less, although this effect fell just short of conventional significance (p = .051); no significant differences were detected for any of the other comparisons. Since our data had relatively few cats (101) and incomplete tracking data, it is possible that these findings are the result of gaps in the data. Further exploration with more cats and longer tracking periods is needed to verify these results. Future research could also look into questions such as how differences in breed affect behaviour or the type of prey caught.