Basketball is a dynamic sport that requires players to succeed in both offensive and defensive roles, which can significantly influence their overall career trajectories. This report aims to investigate three core questions regarding NBA players’ statistics, particularly how they relate to their positions and skill sets. By examining data collected from the 2021-2022 NBA season, we aim to provide insights into the following:
The findings from these questions will provide a comprehensive understanding of how different positions influence players’ offensive and defensive skills, career longevity, and overall performance. We have also identified unanswered questions and areas for further research to pave the way for future NBA player statistics analysis.
The data set used for this report is the National Basketball Association Player Statistics data set, sourced from the SCORE Sports Data Repository. It contains 812 player-team stints from the 2021-2022 NBA regular season, with statistics reported per 100 team possessions to normalize for differences in playing time. The data include a variety of variables covering player information, team information, playing statistics, and ratings. We use these data to build a comprehensive overview of players’ performances and to explore trends and correlations that answer our research questions.
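To make the per-100-possessions scale concrete, here is a minimal Python sketch. The report does not describe how the data set estimates possessions, so the function name, its arguments, and the numbers below are assumptions chosen purely for illustration.

```python
def per_100_possessions(stat_total: float, team_possessions: float) -> float:
    """Scale a raw counting stat to a rate per 100 team possessions."""
    if team_possessions <= 0:
        raise ValueError("team_possessions must be positive")
    return 100.0 * stat_total / team_possessions


# Hypothetical players (made-up numbers, not taken from the data set):
# a starter and a reserve with very different playing time.
starter_rate = per_100_possessions(stat_total=1200, team_possessions=4800)
reserve_rate = per_100_possessions(stat_total=300, team_possessions=1200)

print(starter_rate, reserve_rate)
```

Both hypothetical players come out at the same rate even though their raw totals differ by a factor of four, which is the playing-time normalization the data set relies on.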
This plot shows the relationship between defensive rating (DRTG) and field goal percentage (FG%), faceted by position. There may be a weak correlation: for centers and small forwards, higher defensive ratings seem to go with somewhat lower field goal percentages. This trend does not appear for the other three positions. These results help to answer the question: are good defensive players worse at offense? Our results may suggest that good defensive centers and small forwards (high DRTG) are worse shooters (lower FG%), while the opposite may be true for power forwards and shooting guards: the better they are at defense, the better they shoot. For the most part, however, the points in each panel cluster tightly in one region and do not show a strong correlation. This may indicate that there is little difference in shooting between stronger and weaker defensive players.
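The visual impression from the faceted plot could be quantified with a per-position correlation coefficient. The sketch below is one way to do that in plain Python; the row layout `(pos, drtg, fg_pct)` and every number in `rows` are assumptions made up for illustration, not values from the data set.

```python
from collections import defaultdict
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def correlation_by_position(rows):
    """Group (pos, drtg, fg_pct) rows by position and correlate DRTG with FG%."""
    groups = defaultdict(lambda: ([], []))
    for pos, drtg, fg_pct in rows:
        groups[pos][0].append(drtg)
        groups[pos][1].append(fg_pct)
    return {pos: pearson(d, f) for pos, (d, f) in groups.items()}


# Made-up rows for illustration only; not values from the data set.
rows = [
    ("C", 108, 0.58), ("C", 111, 0.55), ("C", 114, 0.51),
    ("PG", 109, 0.42), ("PG", 112, 0.44), ("PG", 115, 0.47),
]
result = correlation_by_position(rows)
for pos, r in result.items():
    print(pos, round(r, 3))
```

A coefficient near zero for a position would support the reading that the tight clustering reflects no real trend, while a clearly negative or positive value would support a position-specific relationship.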
This box plot takes a different approach to comparing offensive and defensive ratings by position. Each box spans the 25th to 75th percentile of ratings, with the median marked by a horizontal line inside the box. For most positions, the median defensive rating is higher than the median offensive rating, which implies that players at these positions tend to be rated higher defensively than offensively. Offensive ratings appear more variable than defensive ratings, both in the width of the interquartile range and in the number of outliers. Median defensive ratings are similar across positions, while median offensive ratings differ more, with centers having the highest median and point guards the lowest. This graph suggests that while some positions show higher defensive ratings, the relationship between offense and defense is not necessarily inverse, and individual player skills vary greatly. Therefore, the data indicate that being a good defensive player does not inherently mean being worse at offense, though specific roles and trends might influence this dynamic.
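The quantities a box plot displays can be computed directly, which makes the comparison of spread explicit. This is a minimal sketch using Python's `statistics.quantiles`; the two rating lists are made up for illustration, and the `inclusive` quantile method is an assumption, since the report does not say which quantile definition its plotting software used.

```python
from statistics import quantiles


def box_summary(values):
    """Return the quartiles and interquartile range that a box plot draws."""
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return {"q1": q1, "median": q2, "q3": q3, "iqr": q3 - q1}


# Made-up ratings for a single position; not values from the data set.
ortg = [95, 100, 104, 110, 118, 121, 130]
drtg = [108, 110, 111, 112, 113, 114, 116]

ortg_box = box_summary(ortg)
drtg_box = box_summary(drtg)
print(ortg_box)
print(drtg_box)
```

In this toy sample the offensive interquartile range is much wider than the defensive one, mirroring the pattern described for the real data.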
This plot shows, for each position, the average offensive rating (an estimate of points produced per 100 possessions) and the average defensive rating (an estimate of points allowed per 100 possessions). Only players listed at one of the five main positions (Point Guard, Shooting Guard, Small Forward, Power Forward, and Center) were included; the remaining entries were double or triple positions held by only a few players, which obscured the data. Only Power Forwards and Centers had a higher average defensive rating than offensive rating, while the other three positions had a higher offensive rating than defensive rating. The error bars do not overlap for any position, which suggests that these differences are statistically significant. This helps answer our question “Are good defensive players worse at offense?” by showing that, in general, Power Forwards and Centers are the best defensive players and are worse at offense than the other positions.
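The error-bar comparison can be sketched as follows. The report does not state what its error bars represent, so this example assumes they are the mean plus or minus one standard error; the function names and the ratings are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def error_bar(values, width=1.0):
    """Return (low, high) = mean -/+ width * standard error of the mean."""
    center = mean(values)
    se = stdev(values) / sqrt(len(values))
    return center - width * se, center + width * se


def intervals_overlap(a, b):
    """True when two (low, high) intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Made-up ratings for one position; not values from the data set.
ortg = [104, 106, 108, 110]
drtg = [112, 113, 114, 115]

ortg_bar = error_bar(ortg)
drtg_bar = error_bar(drtg)
print(ortg_bar, drtg_bar, intervals_overlap(ortg_bar, drtg_bar))
```

Non-overlapping bars are a visual heuristic rather than a formal test. Confirming significance would take a dedicated comparison, such as a two-sample t-test on the same groups.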
This plot shows the relationship between offensive rating (ORTG) and player age, faceted by position. It also includes a heat map indicating the densest parts of the graph. There does not appear to be much of a relationship at all, as there is a wide range of high and low ORTGs at most ages. The heat map also indicates that most ORTGs fall between about 90 and 125, and that essentially all ages share this density of points. Hence, age does not seem to affect ORTG. Moreover, this pattern appears to be the same for all positions.
## # A tibble: 5 × 3
##   pos   ks_stat p_value
##   <chr>   <dbl>   <dbl>
## 1 C       0.135 0.0124
## 2 PF      0.138 0.00657
## 3 PG      0.151 0.00183
## 4 SF      0.127 0.0142
## 5 SG      0.114 0.0110
We first analyze the univariate plot, graph 5, of the distribution of NBA players’ ages to understand the overall trend. From the graph we can see that the distribution is unimodal and right-skewed, indicating that most players are in their early 20s rather than their late 20s or early 30s. To understand the breakdown, we look at graph 6, which shows a density histogram of age for each position. The p-values for all Kolmogorov-Smirnov tests are less than our significance level of 0.05, so we reject the null hypothesis and conclude that there is significant evidence that the age distributions do not follow a normal distribution. An interesting observation is that for all positions besides Small Forward, the most common age by far is 24. Because Small Forwards do not follow this trend, their distribution is slightly less right-skewed and closer to normal, which is also indicated by their KS test having the largest p-value.
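A Kolmogorov-Smirnov check against a normal distribution can be sketched in plain Python as below. The report does not say which reference normal was used, so this sketch assumes a normal with the sample's own mean and standard deviation, and it uses Kolmogorov's asymptotic formula for the p-value. Because the parameters are estimated from the same data, that p-value is only approximate. Both age samples are made up.

```python
from math import exp, sqrt
from statistics import NormalDist, mean, stdev


def ks_statistic_normal(values):
    """Largest gap between the empirical CDF and a normal CDF fitted to the sample."""
    xs = sorted(values)
    n = len(xs)
    fitted = NormalDist(mean(xs), stdev(xs))
    d = 0.0
    for i, x in enumerate(xs):
        cdf = fitted.cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, (i + 1) / n - cdf, cdf - i / n)
    return d


def asymptotic_p_value(d, n, terms=100):
    """Kolmogorov's limiting distribution: P(sqrt(n) * D > observed)."""
    lam = sqrt(n) * d
    if lam < 0.2:
        return 1.0  # the series is numerically indistinguishable from 1 here
    total = sum((-1) ** (k - 1) * exp(-2.0 * (k * lam) ** 2)
                for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * total))


# Made-up samples for illustration; not ages from the data set.
skewed_ages = ([19] * 5 + [20] * 10 + [21] * 10 + [22] * 8 + [24] * 6
               + [27] * 4 + [31] * 3 + [35] * 2 + [38] * 2)
symmetric_ages = [NormalDist(26, 4).inv_cdf((i + 0.5) / 50) for i in range(50)]

d_skewed = ks_statistic_normal(skewed_ages)
p_skewed = asymptotic_p_value(d_skewed, len(skewed_ages))
d_symmetric = ks_statistic_normal(symmetric_ages)
p_symmetric = asymptotic_p_value(d_symmetric, len(symmetric_ages))
print(d_skewed, p_skewed, d_symmetric, p_symmetric)
```

The right-skewed sample yields a large statistic and a small p-value, while the symmetric sample does not, which is the same logic used to read the table of KS results by position.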
To answer our question of whether different positions tend to have stat lines that center around different points, we explore how clusters in our data form around player positions. The plot shows the 10 pairwise comparisons between positions. For the purpose of visualization, we used only the first two principal components. Overall, the plot shows that clusters form around positions, but some positions are more closely related than others. Centers and power forwards show the most separation from point guards and shooting guards, whereas small forwards tend to show little separation from the guards. This suggests that there are clusters in our data, but not 5 distinct clusters indexed by position. A better interpretation is likely that there are 2 clusters: one composed of centers and power forwards, and one composed of small forwards, point guards, and shooting guards.
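For readers unfamiliar with the projection step, the following sketch computes the first two principal components of a small table using power iteration with deflation, in plain Python. It is an illustration of the technique, not the report's implementation. The report does not state whether features were scaled, so this sketch assumes they are only centered. The three features and all values are made up, and in practice a numerical library would normally do this work.

```python
from math import sqrt


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def mat_vec(matrix, vec):
    return [dot(row, vec) for row in matrix]


def centered_rows(rows):
    """Subtract each column's mean so every feature has mean zero."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def covariance_matrix(rows):
    """Sample covariance matrix of already-centered rows."""
    n, p = len(rows), len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) / (n - 1) for j in range(p)]
            for i in range(p)]


def leading_eigenpair(matrix, iterations=500):
    """Power iteration for the largest eigenvalue of a symmetric PSD matrix."""
    vec = [1.0 / (i + 1) for i in range(len(matrix))]
    norm = sqrt(dot(vec, vec))
    vec = [x / norm for x in vec]
    for _ in range(iterations):
        nxt = mat_vec(matrix, vec)
        norm = sqrt(dot(nxt, nxt))
        if norm == 0.0:
            break
        vec = [x / norm for x in nxt]
    return dot(vec, mat_vec(matrix, vec)), vec


def first_two_components(rows):
    data = centered_rows(rows)
    cov = covariance_matrix(data)
    val1, vec1 = leading_eigenpair(cov)
    # Deflate: remove the first component so power iteration finds the second.
    size = len(cov)
    deflated = [[cov[i][j] - val1 * vec1[i] * vec1[j] for j in range(size)]
                for i in range(size)]
    val2, vec2 = leading_eigenpair(deflated)
    scores = [(dot(r, vec1), dot(r, vec2)) for r in data]
    return (val1, val2), (vec1, vec2), scores


# Made-up per-100 stats (rebounds, assists, three-point attempts).
# The first three rows stand in for big men, the last three for guards.
players = [
    [15, 3, 1], [14, 4, 2], [16, 2, 0.5],
    [5, 9, 8], [4, 11, 9], [6, 8, 7],
]
variances, components, scores = first_two_components(players)
print(variances)
print(scores)
```

In this toy table the first component alone separates the two hypothetical groups, which is the kind of big-versus-guard split described above.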
To validate our claim that there are 2 distinct clusters rather than 5 clusters indexed by position, we plotted the first two principal components and colored the points by the 2 position clusters: the first composed of power forwards and centers, and the second composed of shooting guards, point guards, and small forwards. The contour plot and coloring show that although the 2 clusters overlap, they remain distinct. It is difficult to say whether there is a better way to group the positions, as there are 15 ways to group the five positions into 2 clusters and even more (25) ways to group them into 3 clusters. However, in the pairwise plot above we saw that all positions’ first two principal components centered around nearby points, despite the distinctions between them. So it is likely that, regardless of how the positions are grouped, the clusters will overlap.
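The number of ways to split the five positions into k non-empty, unlabeled groups is the Stirling number of the second kind, S(5, k). A short sketch of the standard recurrence (the function name is ours):

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def stirling2(n: int, k: int) -> int:
    """Ways to partition n labeled items into k non-empty unlabeled groups."""
    if n == k:
        return 1
    if k == 0 or k > n:
        return 0
    # Either the n-th item joins one of k existing groups, or it starts a new one.
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)


two_cluster_groupings = stirling2(5, 2)
three_cluster_groupings = stirling2(5, 3)
print(two_cluster_groupings, three_cluster_groupings)
```

This gives 15 possible two-cluster groupings and 25 possible three-cluster groupings, so checking every alternative by eye is impractical even for only five positions.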
This report evaluated three key questions to understand the interplay between NBA player statistics and their positions, and how these provide insights into career longevity and success.
Our analysis revealed that while some positions, particularly centers and power forwards, tend to have higher defensive ratings than offensive ratings, there is no straightforward inverse relationship between offensive and defensive skills. The spread of offensive ratings within each position suggests that player skills vary within positions, which highlights the unique skill set each player brings to the court. From this we cannot confirm that good defensive players are necessarily worse at offense. Secondly, players’ ages did not correlate strongly with offensive ratings at any position, so we found no evidence that career duration is directly tied to offensive performance. Lastly, our principal component analysis demonstrated that while clusters form around player positions, there is considerable overlap. For our last question, whether different positions’ stat lines center around different points, this suggests that there are primarily two clusters: one for centers and power forwards, and one for shooting guards, point guards, and small forwards. This indicates that specific roles influence player statistics, but that positions may still center around similar points.
In conclusion, the graphs from our NBA data set show that performance metrics vary considerably across positions, and that there is not always a direct relationship between offensive and defensive skills. Furthermore, there was not enough evidence to conclude that career longevity is linked to offensive performance, and while there are clusters around different positions, significant overlap exists.
Given that our data set contains individual player statistics, we are still curious how these statistics relate to actual success: games won, playoff games won, championships won, salary, or even individual awards such as MVP. This report does not answer questions related to this kind of success, such as:
We were unable to answer these questions due to a lack of data, but if the aforementioned statistics were available, along with statistics spanning many seasons, future work could analyze and answer these questions.