For years, players have been identified by their defensive position on court, and informally by names such as Rim Runner or Spot Up Big, or by the way they move the ball. These are the results Wikipedia gives for a search on types of players in basketball:
But these are defensive positions, not player types, and similar results can be seen on the official NBA website.
There is therefore an urgent need to analyse players on the basis of their on-court performance rather than their position, in order to understand players and teams better and to take decisive action towards improvement.
We scrape the data from the Basketball Reference website and refer to the official NBA website for further help.
To scrape the data from Basketball Reference, we used urlopen (from Python's urllib.request module) and the BeautifulSoup library to access the pages.
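
The scraping step can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the URL is an assumed address for the 2018-19 per-game page, the function names `parse_stats_table` and `fetch_stats` are hypothetical, and the parser simply takes the first table it finds. The demonstration runs on a small hand-written HTML snippet so that it works offline.

```python
from urllib.request import urlopen

from bs4 import BeautifulSoup

# Assumed URL for the 2018-19 per-game table; adjust it to the season you need.
URL = "https://www.basketball-reference.com/leagues/NBA_2019_per_game.html"


def parse_stats_table(html):
    """Return (headers, rows) from the first HTML table found in `html`."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table")
    # The first <tr> holds the column names.
    headers = [th.get_text(strip=True) for th in table.find("tr").find_all("th")]
    rows = []
    for tr in table.find_all("tr")[1:]:
        cells = [cell.get_text(strip=True) for cell in tr.find_all(["th", "td"])]
        # Skip empty rows and header rows repeated inside a long table.
        if cells and cells != headers:
            rows.append(cells)
    return headers, rows


def fetch_stats(url=URL):
    """Download the page and parse it (this performs a network request)."""
    with urlopen(url) as response:
        return parse_stats_table(response.read())


# Offline demonstration on a tiny hand-written table (hypothetical markup).
SAMPLE_HTML = """
<table>
  <tr><th>Player</th><th>Pos</th><th>PTS</th></tr>
  <tr><td>Steven Adams</td><td>C</td><td>13.9</td></tr>
  <tr><th>Player</th><th>Pos</th><th>PTS</th></tr>
  <tr><td>Bam Adebayo</td><td>C</td><td>8.9</td></tr>
</table>
"""
headers, rows = parse_stats_table(SAMPLE_HTML)
print(headers)
print(rows)
```

The parsed headers and rows can then be passed to `pandas.DataFrame(rows, columns=headers)` to obtain a table like the one shown next.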
| | Player | Pos | Age | Tm | G | GS | MP | FG | FGA | FG% | 3P | 3PA | 3P% | 2P | 2PA | 2P% | eFG% | FT | FTA | FT% | ORB | DRB | TRB | AST | STL | BLK | TOV | PF | PTS |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | Álex Abrines | SG | 25 | OKC | 31 | 2 | 19.0 | 1.8 | 5.1 | .357 | 1.3 | 4.1 | .323 | 0.5 | 1.0 | .500 | .487 | 0.4 | 0.4 | .923 | 0.2 | 1.4 | 1.5 | 0.6 | 0.5 | 0.2 | 0.5 | 1.7 | 5.3 |
1 | Quincy Acy | PF | 28 | PHO | 10 | 0 | 12.3 | 0.4 | 1.8 | .222 | 0.2 | 1.5 | .133 | 0.2 | 0.3 | .667 | .278 | 0.7 | 1.0 | .700 | 0.3 | 2.2 | 2.5 | 0.8 | 0.1 | 0.4 | 0.4 | 2.4 | 1.7 |
2 | Jaylen Adams | PG | 22 | ATL | 34 | 1 | 12.6 | 1.1 | 3.2 | .345 | 0.7 | 2.2 | .338 | 0.4 | 1.1 | .361 | .459 | 0.2 | 0.3 | .778 | 0.3 | 1.4 | 1.8 | 1.9 | 0.4 | 0.1 | 0.8 | 1.3 | 3.2 |
3 | Steven Adams | C | 25 | OKC | 80 | 80 | 33.4 | 6.0 | 10.1 | .595 | 0.0 | 0.0 | .000 | 6.0 | 10.1 | .596 | .595 | 1.8 | 3.7 | .500 | 4.9 | 4.6 | 9.5 | 1.6 | 1.5 | 1.0 | 1.7 | 2.6 | 13.9 |
4 | Bam Adebayo | C | 21 | MIA | 82 | 28 | 23.3 | 3.4 | 5.9 | .576 | 0.0 | 0.2 | .200 | 3.4 | 5.7 | .588 | .579 | 2.0 | 2.8 | .735 | 2.0 | 5.3 | 7.3 | 2.2 | 0.9 | 0.8 | 1.5 | 2.5 | 8.9 |
5 | Deng Adel | SF | 21 | CLE | 19 | 3 | 10.2 | 0.6 | 1.9 | .306 | 0.3 | 1.2 | .261 | 0.3 | 0.7 | .385 | .389 | 0.2 | 0.2 | 1.000 | 0.2 | 0.8 | 1.0 | 0.3 | 0.1 | 0.2 | 0.3 | 0.7 | 1.7 |
6 | DeVaughn Akoon-Purcell | SG | 25 | DEN | 7 | 0 | 3.1 | 0.4 | 1.4 | .300 | 0.0 | 0.6 | .000 | 0.4 | 0.9 | .500 | .300 | 0.1 | 0.3 | .500 | 0.1 | 0.4 | 0.6 | 0.9 | 0.3 | 0.0 | 0.3 | 0.6 | 1.0 |
7 | LaMarcus Aldridge | C | 33 | SAS | 81 | 81 | 33.2 | 8.4 | 16.3 | .519 | 0.1 | 0.5 | .238 | 8.3 | 15.8 | .528 | .522 | 4.3 | 5.1 | .847 | 3.1 | 6.1 | 9.2 | 2.4 | 0.5 | 1.3 | 1.8 | 2.2 | 21.3 |
8 | Rawle Alkins | SG | 21 | CHI | 10 | 1 | 12.0 | 1.3 | 3.9 | .333 | 0.3 | 1.2 | .250 | 1.0 | 2.7 | .370 | .372 | 0.8 | 1.2 | .667 | 1.1 | 1.5 | 2.6 | 1.3 | 0.1 | 0.0 | 0.8 | 0.7 | 3.7 |
9 | Grayson Allen | SG | 23 | UTA | 38 | 2 | 10.9 | 1.8 | 4.7 | .376 | 0.8 | 2.6 | .323 | 0.9 | 2.1 | .443 | .466 | 1.2 | 1.6 | .750 | 0.1 | 0.5 | 0.6 | 0.7 | 0.2 | 0.2 | 0.9 | 1.2 | 5.6 |
A detailed analysis can be found in the Jupyter notebook attached above, Archtype of NBA Players. Here are some findings from this section.
This correlogram shows the correlation between all the features in the data.
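
As a rough illustration of what sits behind such a correlogram, the sketch below computes a Pearson correlation matrix with pandas. It uses only five columns from the ten sample rows in the table above, so the numbers are not the ones from the full analysis; the choice of columns is an assumption made for brevity.

```python
import pandas as pd

# Per-game values copied from the ten sample rows shown in the table above.
sample = pd.DataFrame(
    {
        "MP": [19.0, 12.3, 12.6, 33.4, 23.3, 10.2, 3.1, 33.2, 12.0, 10.9],
        "FGA": [5.1, 1.8, 3.2, 10.1, 5.9, 1.9, 1.4, 16.3, 3.9, 4.7],
        "TRB": [1.5, 2.5, 1.8, 9.5, 7.3, 1.0, 0.6, 9.2, 2.6, 0.6],
        "AST": [0.6, 0.8, 1.9, 1.6, 2.2, 0.3, 0.9, 2.4, 1.3, 0.7],
        "PTS": [5.3, 1.7, 3.2, 13.9, 8.9, 1.7, 1.0, 21.3, 3.7, 5.6],
    }
)

# Pearson correlation between every pair of columns.
corr = sample.corr(method="pearson")
print(corr.round(2))
```

Drawing this matrix as a heatmap, for instance with a plotting library of your choice, gives the correlogram.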
I tried to model the linear relationships between all the scoring variables in the processed data.
Clustering methods can be broadly classified into the following four types, each of which uses a different technique to measure differences between data points:
- Exclusive Clustering
- Overlapping Clustering
- Hierarchical Clustering
- Probabilistic Clustering

We will use all of the methods listed above except Overlapping Clustering.
Using an elbow plot and the Silhouette Coefficient, we decide the number of clusters.
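
A minimal sketch of this selection step is shown below, assuming k-means as the clustering algorithm and scikit-learn as the library. The data is synthetic (generated with `make_blobs`) and only stands in for the scaled player statistics, and the range of k values tried is arbitrary.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the player features (assumption, not the real data).
X, _ = make_blobs(n_samples=300, centers=4, cluster_std=0.8, random_state=42)
X = StandardScaler().fit_transform(X)

inertias = {}     # within-cluster sum of squares, used for the elbow plot
silhouettes = {}  # mean Silhouette Coefficient for each k
for k in range(2, 9):
    model = KMeans(n_clusters=k, n_init=10, random_state=42).fit(X)
    inertias[k] = model.inertia_
    silhouettes[k] = silhouette_score(X, model.labels_)

# The elbow is read off a plot of inertia against k; the silhouette
# criterion simply picks the k with the highest score.
best_k = max(silhouettes, key=silhouettes.get)
for k in inertias:
    print(k, round(inertias[k], 1), round(silhouettes[k], 3))
print("k with highest silhouette:", best_k)
```

Inertia keeps falling as k grows, so the elbow plot shows where the improvement flattens, while the Silhouette Coefficient gives a single value per k that can be compared directly.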
Agglomerative hierarchical clustering is a "bottom-up" approach: each observation starts in its own cluster, and pairs of clusters are merged as one moves up the hierarchy.
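
The bottom-up merging can be seen on a tiny example. The sketch below uses SciPy's `linkage` with Ward linkage, which is an assumption since the text does not name a linkage criterion, on five made-up 2-D points rather than player data.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

# Five hypothetical points: two tight pairs and one isolated point.
points = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0], [10.0, 0.0]])

# Each row of Z records one merge: the two cluster ids joined, the
# distance at which they were joined, and the size of the new cluster.
Z = linkage(points, method="ward")
for left, right, distance, size in Z:
    print(int(left), int(right), round(distance, 2), int(size))

# Cut the hierarchy so that at most two clusters remain.
labels = fcluster(Z, t=2, criterion="maxclust")
print(labels)
```

With n observations there are always n - 1 merges, and the last one joins everything into a single cluster.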
We move forward in our analysis with the clustering technique that has the highest Silhouette Coefficient.
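
One way to make this comparison is sketched below. The text does not name the algorithms, so this assumes k-means for exclusive clustering, agglomerative clustering for hierarchical clustering and a Gaussian mixture model for probabilistic clustering, all from scikit-learn. The data is synthetic and the number of clusters is fixed at four purely for illustration.

```python
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score
from sklearn.mixture import GaussianMixture

# Synthetic stand-in for the player features (assumption, not the real data).
X, _ = make_blobs(n_samples=300, centers=4, random_state=0)
n_clusters = 4

# Fit each method and keep the cluster label assigned to every point.
labels_by_method = {
    "k-means (exclusive)": KMeans(
        n_clusters=n_clusters, n_init=10, random_state=0
    ).fit_predict(X),
    "agglomerative (hierarchical)": AgglomerativeClustering(
        n_clusters=n_clusters
    ).fit_predict(X),
    "Gaussian mixture (probabilistic)": GaussianMixture(
        n_components=n_clusters, random_state=0
    ).fit_predict(X),
}

# Score every labelling with the mean Silhouette Coefficient.
scores = {
    name: silhouette_score(X, labels) for name, labels in labels_by_method.items()
}
best_method = max(scores, key=scores.get)
for name, score in scores.items():
    print(f"{name}: {score:.3f}")
print("selected:", best_method)
```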
A parallel coordinates plot gives a very good view of the similarities and dissimilarities between clusters. Every line describes a player and every colour describes a cluster. The plot is interactive, so you can drag along any feature axis to fix its limits and analyse the cluster you are interested in.
- The boxplot above gives valuable information that NBA coaches can use to review their team and strengthen it by releasing players of one type and drafting players of another.
- If we take the mean as the optimum value, the boxplot above suggests that a strong team should have the following proportion of each player type:
TYPE 1 : 25%
TYPE 2 : 46%-49%
TYPE 3 : 15%
TYPE 4 : LESS than 10%
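
Proportions of this kind can be computed along the following lines. This sketch assumes that the boxplot is built from the share of each player type within each team and that the figures above are the means of those shares. The team names, type labels and column names are toy values chosen for illustration, not results from the analysis.

```python
import pandas as pd

# Hypothetical cluster assignments for two made-up teams.
players = pd.DataFrame(
    {
        "Tm": ["A", "A", "A", "A", "B", "B", "B", "B"],
        "type": [1, 1, 2, 2, 1, 2, 2, 3],
    }
)

# Share of each player type within every team (each row sums to 1).
shares = pd.crosstab(players["Tm"], players["type"], normalize="index")
print(shares)

# Mean share of each type across teams.
mean_share = shares.mean()
print(mean_share)
```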