English Premier League Managers
Data Exploration
By Ifeoma Egbogah in Football Bayes R
March 29, 2023
Hi, I'm the here-bot cat!
Use me to find your way in your website.
- Here I am:
- content/blog/English-Premier-League-Ranking-Series/02-Data-Exploration/Data-Exploration.md
- Here is my R Markdown source file:
- blog/English-Premier-League-Ranking-Series/02-Data-Exploration/Data-Exploration.Rmd
You'll want to edit this file, then re-knit to see the changes take effect in your site preview.
To remove me, delete this line inside that file: {{< here >}}
- My content section is:
- blog
- My layout is:
- single-series
What in the world!!!
I recently came across a post that ranked English Premier League (EPL) managers based on their raw winning percentages. I felt that this approach was unfair to some exceptional managers whose winning averages may not accurately reflect their capabilities due to limited opportunities to manage games. For example, comparing a manager who oversaw 16 wins out of 80 games to one with 70 wins out of 350 games, both with a winning average of 20%, would be unfair.
To address this, I decided to use Bayesian inference to determine the best manager in the Premier League, hoping that it would be Arsene Wenger, as I am a devoted Arsenal fan. Fingers crossed.
As a first step, I loaded the relevant data and packages into R, with the data obtained from the Premier League website (www.premierleague.com). From the league’s inception in 1992 to 2018, a total of 219 managers made their debut in the EPL.
Managers | Clubs | Total Games Managed | Games Won | Games Drawn | Games Lost | Num Of Pl Seasons | Nationality | Date Of Birth | Pl Debut | Debut Match |
---|---|---|---|---|---|---|---|---|---|---|
Eddie Howe | Bournemouth | 114 | 34 | 30 | 50 | 3 | England | 1977-11-29 | 2015-08-08 | Lost |
Brue Rioch | Arsenal | 38 | 17 | 12 | 9 | 1 | Scotland | 1947-09-06 | 1995-08-20 | Drawn |
Arsene Wenger | Arsenal | 828 | 476 | 199 | 153 | 22 | France | 1949-10-22 | 1996-10-12 | Won |
Stewart Houston | Arsenal | 19 | 7 | 4 | 8 | 2 | Scotland | 1949-08-20 | 1995-02-21 | Won |
David O’Leary | Aston Villa | 343 | 148 | 82 | 104 | 7 | England | 1958-05-02 | 1998-10-01 | na |
I spy?
Managers' Age on Debut
Let’s start by getting to know the data and see what information we can glean from it. This data covers from 1992 to 2018. I begin by calculating the ages of the managers and the mean age at which managers make a debut in the league. The youngest managers in the league were Chris Coleman and Attilio Lombardo while the oldest was Dick Advocaat, making a debut at ages 32 and 67 years respectively. The mean age was 45 years.
Table: Table 1: Youngest and Oldest Managers
Managers | Age |
---|---|
Attilio Lombardo | 32 |
Chris Coleman | 32 |
Dick Advocaat | 67 |
epl <- epl%>%
mutate(age = year(as.period(interval((date_of_birth), (pl_debut)))))
summary_epl <- epl%>%
summarise(mean_age = mean(age),
youngest = min(age),
oldest = max(age))
age_table <- epl%>%
select(managers, age)%>%
filter(age %in% c(32, 67))
knitr::kable(age_table, col.names = c("Managers", "Age"), caption = "Youngest and Oldest Managers")
Table: Table 2: Youngest and Oldest Managers
Managers | Age |
---|---|
Attilio Lombardo | 32 |
Chris Coleman | 32 |
Dick Advocaat | 67 |
Lets get a better view of what the age distribution looks like by plotting a histogram.
epl%>%
ggplot(aes(age)) +
geom_histogram(binwidth = 2, fill = scales::muted("darkorchid"), alpha = 0.8) +
labs(x = "Age",
y = "Count",
title = "Distribution of Managers' Age on Premier League Debut") +
theme(plot.title = element_text(hjust = 0.5, size = 50),
axis.title = element_text(size = 30),
axis.text = element_text(size = 25))
Where are they from?
One thing the English League is known for is the wide range in the nationality of not only the players but also the managers of the clubs. So, What is the most common nationality of premier league managers? Let’s find out.
epl %>%
count(nationality, sort=TRUE) %>%
# mutate(nationality = fct_reorder(nationality, n))%>%
ggplot(aes(nationality, n)) +
geom_col(aes(fill = nationality)) +
labs(x = "Country of Origin",
y = "Count",
title = "Nationality of Premier League Managers") +
scale_fill_scico_d(palette = "bam") +
theme(legend.position = "none",
plot.title = element_text(hjust = 0.5, size = 50),
axis.title = element_text(size = 30),
axis.text = element_text(size = 25)) +
coord_flip()
Err… it looks a bit scattered let’s arrange it in descending order based on there percentage so as to get a clearer picture.
Origin <- epl%>%
count(nationality, sort=TRUE) %>%
mutate(percent = (n/sum(n))) %>%
mutate(country = fct_reorder(nationality, percent))%>%
ggplot(aes(country, percent)) +
geom_col(aes(fill = country)) +
labs(x = "Country of Origin",
y = "Percentage",
title = "Nationality for Premier League Managers") +
scale_fill_scico_d(palette = "bam") +
scale_y_continuous(labels = label_percent()) +
theme(legend.position = "none",
plot.title = element_text(hjust = 0.5, size = 50),
axis.title = element_text(size = 30),
axis.text = element_text(size = 25)) +
coord_flip()
Origin
That’s much better. Here we can see that most managers in the league are English with 51%, followed by the Scots at 13%. Uruguay, United States, Switzerland, Sweden, Israel, Denmark, Chile and Brazil have each had one manager from these countries debuting in the premier league.
Who Runs the Top Flight Clubs?
A popular opinion among sports pundits is that English managers haven’t been given the chance to over see the top flight clubs in the league. How true is this? Let’s find out.
all_club <- epl%>%
count(nationality, clubs, sort = TRUE)%>%
group_by(clubs)%>%
mutate(total = sum(n))%>%
ggplot(aes(fct_reorder(clubs, total), n)) +
geom_col(aes(fill= nationality)) +
labs(x = "Country of Origin",
y = "Counts",
fill = "Nationality",
title = "Nationality of the Premier League Managers for Each Clubs") +
scale_fill_scico_d(palette = "bam") +
scale_y_continuous(breaks = seq(2, 12, 2)) +
coord_flip() +
theme(plot.title = element_text(hjust = 0.5, size = 50),
axis.title = element_text(size = 30),
axis.text = element_text(size = 25),
legend.title = element_text(size = 40),
legend.key.size = unit(3, "cm"),
legend.text = element_text(size = 20))
top_club <- epl%>%
filter(clubs %in% c("Arsenal", "Chelsea", "Manchester United", "Manchester City", "Liverpool", "Tottenham Hotspur"))%>%
count(nationality, clubs, sort = TRUE)%>%
group_by(clubs)%>%
mutate(total = sum(n))%>%
ggplot(aes(fct_reorder(clubs, total), n)) +
geom_col(aes(fill= nationality)) +
labs(x = "Country of Origin",
y = "Counts",
fill = "Nationality",
title = "Nationality of the Premier League Managers for the Top Flight Clubs") +
scale_fill_scico_d(palette = "bam") +
scale_y_continuous(breaks = seq(2, 12, 2)) +
coord_flip() +
theme(plot.title = element_text(hjust = 0.5, size = 40),
axis.title = element_text(size = 30),
axis.text = element_text(size = 22),
legend.title = element_text(size = 35),
legend.key.size = unit(2, "cm"),
legend.text = element_text(size = 25))
all_club
top_club
The bar chart above shows that all the top 6 clubs except Manchester United and Arsenal have had an English Manager in charge. Manchester united not having an English manager since the creation of the league is partially because Sir Alex Ferguson (who debuted at the club in 1986) was the manager of the club before the league was created and he spent over 20 years at the helm of affairs in the club before handing over at the end of 2012/13 season.
Match Result on Premier League Debut
David O’Leary and Martin O’Neill had no information on their English premier league debut. So I’ll filter them out. Most managers lost on their first English premier league debut (42.5%).
debut <- epl %>%
filter(debut_match != "na") %>%
count(debut_match, sort=TRUE) %>%
mutate(percent = (n/sum(n)))%>%
ggplot(aes(percent, debut_match)) +
geom_col(fill = scales::muted("darkorchid"), alpha = 0.8) +
scale_x_continuous(labels = scales::label_percent())+
labs(x = "Percent (%)",
y = "Debut Match",
title = "Percentage of Managers' Debut Match Win, Loss or Draw",
subtitle = "They say the English League is very intense and quite competitive.\nFamous managers like Arsene Wenger, Josep Guardiola, Jose Mourinho won their Premier League debut match.\nSir Alex lost his debut macth while Jurgen Klopp drew his debut match.") +
theme(plot.title = element_text(hjust = 0.5, size = 50),
plot.subtitle = element_text(face = "italic", size = 30, colour = "grey"),
legend.text = element_text(size = 15),
axis.title = element_text(size = 30),
axis.text = element_text(size = 25))
debut
Out of the 112 English managers in the league 50 lost their debut matches. So far no Argentine, German, Portuguese or Spanish managers have lost their debut match.
debut_nat <- epl %>%
filter(debut_match != "na") %>%
count(debut_match, nationality, sort=TRUE)%>%
group_by(nationality)%>%
mutate(total = sum(n))%>%
ggplot(aes(fct_reorder(nationality, total), n)) +
geom_col(aes(fill = debut_match)) +
coord_flip() +
scale_fill_scico_d(palette = "bam", direction = -1) +
scale_y_continuous(breaks = seq(0, 130, 20)) +
labs(x = "Nationality of Managers",
y = "No of Matches",
title = "Number of Matches Won, Drawn, or Lost by Managers on Debut Day",
fill = "Match Result") +
theme(plot.title = element_text(hjust = 0.5, size = 50),
axis.title = element_text(size = 30),
axis.text = element_text(size = 25),
legend.title = element_text(size = 35),
legend.key.size = unit(3, "cm"),
legend.text = element_text(size = 20))
debut_nat
Out of the 112 English managers in the league 50 lost their debut matches. So far no Argentine, German, Portuguese or Spanish managers have lost their debut match.
Now that we have explored our data, we will begin analyzing in the next post.