Tidyverse exercises

Songs dataset: exercises

library(tidyverse) # Just to be sure it's the first thing that is done...
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.0 ──
## ✓ ggplot2 3.3.5     ✓ purrr   0.3.4
## ✓ tibble  3.1.5     ✓ dplyr   1.0.7
## ✓ tidyr   1.1.3     ✓ stringr 1.4.0
## ✓ readr   1.4.0     ✓ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()

From the songs dataset:

  1. load the data.
  2. find the songs with popularity (strictly) above 95.
  3. find the songs from Kanye West.
  4. create a pivot table over artists that counts how many songs each artist has in the dataset, then sort the result in descending order (the first line corresponds to the artist with the most songs).
  5. create a pivot table over artists that averages the popularity of their songs, then sort the result in descending order (the first line corresponds to the artist whose songs are the most popular on average).
  6. in the above pivot table, the best artists are those with only one (good) song. That’s not fair! To take this into account, create a pivot table with the two indicators (number of songs and average popularity), keep only the artists with at least 10 songs, and then rank them by descending average popularity.
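
The filtering and pivot-table steps above can be sketched on a toy table. This is only an illustration under stated assumptions: the column names `artist`, `song` and `popularity` and the values are invented, so they must be adapted to the real songs data, and the minimum number of songs is lowered from 10 to 2 so that the toy table is not emptied by the filter.

```r
library(dplyr)  # already attached by library(tidyverse)

# Toy stand-in for the songs data (hypothetical column names and values).
songs_toy <- tibble(
  artist     = c("A", "A", "A", "B", "C", "C"),
  song       = c("a1", "a2", "a3", "b1", "c1", "c2"),
  popularity = c(80, 90, 97, 99, 60, 70)
)

# Question 2: strict inequality, so a song at exactly 95 would be excluded.
popular <- songs_toy %>%
  filter(popularity > 95)

# Question 3: keep the rows of a single artist.
by_artist_a <- songs_toy %>%
  filter(artist == "A")

# Questions 4 to 6: one row per artist with both indicators,
# then keep artists with enough songs and rank by average popularity.
min_songs <- 2  # the exercise asks for 10; 2 suits the toy data
artist_stats <- songs_toy %>%
  group_by(artist) %>%
  summarise(nb_songs = n(), avg_popularity = mean(popularity)) %>%
  filter(nb_songs >= min_songs) %>%
  arrange(desc(avg_popularity))

print(artist_stats)
```

Artist "B" has the highest average popularity but only one song, so it is dropped by the `nb_songs` filter. This is the effect that question 6 is designed to correct.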

Movies dataset: exercises

From the movies dataset:

  1. load the data.
  2. find the movies with imdb score strictly above 8.8 (those are the great movies).
  3. find the movies from Tim Burton.
  4. compute the number of different directors in the dataset.
  5. create a pivot table over directors that counts how many films they have in the dataset and sort the result in descending order (the first line corresponds to the director with the most films).
  6. create a pivot table over directors that averages the imdb score of their films and sort the results in descending order (the first line corresponds to the director whose films have the highest imdb score on average).
  7. in the above pivot table, the best directors are those with only one (good) film. That’s not fair! To take this into account, create a pivot table with the two indicators (number of films and average imdb score), keep only the directors with at least 10 films, and then rank them by descending average imdb score.
  8. create a new column that computes the earnings/budget ratio. Perform the same analysis as in question 6, but on this ratio.
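
The steps specific to the movies exercises (counting distinct directors and adding a ratio column) can be sketched the same way. The column names `director`, `imdb_score`, `earnings` and `budget` and the values are assumptions made for illustration and may differ from the real movies data. The threshold of 10 films is again lowered to 2 to fit the toy table.

```r
library(dplyr)  # already attached by library(tidyverse)

# Toy stand-in for the movies data (hypothetical column names and values).
movies_toy <- tibble(
  director   = c("D1", "D1", "D2", "D2", "D3"),
  imdb_score = c(7.0, 8.0, 9.0, 6.5, 8.9),
  earnings   = c(200, 300, 50, 150, 400),
  budget     = c(100, 100, 100, 50, 100)
)

# Question 2: strict inequality on the score.
great_movies <- movies_toy %>%
  filter(imdb_score > 8.8)

# Question 4: number of different directors.
nb_directors <- n_distinct(movies_toy$director)

# Question 7: both indicators, minimum number of films, ranked by score.
min_films <- 2  # the exercise asks for 10; 2 suits the toy data
director_stats <- movies_toy %>%
  group_by(director) %>%
  summarise(nb_films = n(), avg_score = mean(imdb_score)) %>%
  filter(nb_films >= min_films) %>%
  arrange(desc(avg_score))

# Question 8: add the ratio column, then average it per director.
ratio_stats <- movies_toy %>%
  mutate(ratio = earnings / budget) %>%
  group_by(director) %>%
  summarise(avg_ratio = mean(ratio)) %>%
  arrange(desc(avg_ratio))

print(director_stats)
print(ratio_stats)
```

On real data, a budget of zero or a missing value would give an infinite or `NA` ratio, so it may be worth filtering such rows before averaging.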