Coverage, Redundancy and Size-Awareness in Genre Diversity for
Recommender Systems

Saúl Vargas, Linas Baltrunas,
Alexandros Karatzoglou and Pablo Castells

RecSys 2014 Silicon Valley

UAM TID

Diversity in
Recommendations

(today intra-list diversity)

Silver Linings, The Bridge on the River Kwai, Gone with the Wind
I miss Sci-Fi
Star Wars, Star Trek, Ender's Game
Too much Sci-Fi!
  • Being accurate is not enough (Herlocker et al. 2004, McNee et al. 2006)
  • Need to address the user's varied tastes
  • Need to consider the user's preference for diverse recommendations
Blade Runner, Love Actually, Saving Private Ryan
That's better!
“If you can not measure it,
you can not improve it.” Lord Kelvin

So we know what we mean by diversity... Now, how do we measure it?

We use genres to measure it

“Genre: a category of artistic, musical, or literary composition characterized by a particular style, form, or content.”
Merriam-Webster Dictionary
  • define a conventional style
  • represent the different tastes of users
  • available in most online media catalogs for movies, literature, music, etc.
  • perceivable in recommendations

Genres present some interesting properties

Generality

Drama
Western

Overlap

Overlap between top 5 genres

Trying to Measure
Genre Diversity

Pair-wise Framework

Ziegler et al. (2005): sum of similarities between pairs of recommended items.

\[ \text{ILS}(R) = \sum_{i,j \in R} sim(i, j) \]
Silver Linings: Comedy, Romance
The Bridge on the River Kwai: Adventure, War
Gone with the Wind: Biography, History

...but I was expecting Sci-Fi!
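The pair-wise score for the list above can be sketched as follows. This is only an illustration: the slide does not say which similarity function is used, so Jaccard similarity over genre sets is an assumption here, as is summing over unordered pairs rather than all ordered pairs.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two genre sets (an assumed choice of sim)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def ils(rec):
    """Sum of sim(i, j) over unordered pairs of items in the list."""
    return sum(jaccard(rec[i], rec[j]) for i, j in combinations(rec, 2))


# Item -> genre set, as listed on the slide.
rec = {
    "Silver Linings": {"Comedy", "Romance"},
    "The Bridge on the River Kwai": {"Adventure", "War"},
    "Gone with the Wind": {"Biography", "History"},
}

score = ils(rec)
print(score)
```

No two items share a genre, so the list has the lowest possible similarity. The metric therefore rates it as maximally diverse even though it ignores the user's interest in Sci-Fi.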

Intent-aware Framework

Agrawal et al. (2009): sum of the weighted marginal relevance of each genre

\[ \text{M-IA}(R) = \sum_g p(g) \; \text{M}(R|g) \]
Wild Wild West: Western, Comedy
Cowboys and Aliens: Western, Sci-Fi
The Good, the Bad and the Ugly: Western, Adventure

Good in Comedy

Good in Sci-fi

Good in Adventure

...but too much Western

Proportionality Framework

Dang and Croft (2012): need for covering each genre with a number of items proportional to the interest of the user.

\[ \text{\#present} \geq \text{\#expected} \]

Suffers from problems similar to those of the intent-aware framework

The state-of-the-art diversity frameworks seem inadequate for modeling genre-based diversity in recommendations

Which properties should a good diversity framework fulfill?

Coverage, redundancy and size-awareness

Could we define a framework to satisfy them?

Yes, the Binomial Diversity framework

Properties of
Genre Diversity

Coverage

Since most users enjoy items from a variety of genres, it is important that the recommendation list covers as many of them as possible.

The coverage should be proportional: the personalized importance of each genre is not equal.

Redundancy

It is as important to present items that cover a certain genre as to present other items that do not cover it.

Wild Wild West: Western, Comedy
Cowboys and Aliens: Western, Sci-Fi
The Good, the Bad and the Ugly: Western, Adventure

Four genres... but Western is definitely redundant

Size-awareness


Mobile setting: limited screen real estate to show recommendations requires a careful selection of what to display in a list

The available list size should be taken into account by the model

Not explored in prior work on search or recommendation diversity

The Binomial Diversity Framework

Rationale

Random recommendation as a naturally diverse but inaccurate algorithm

The selection of genres in a random recommendation can be modeled with a Binomial Distribution

We take the random recommendation as a model for the occurrence of genres in a good genre-based diverse recommendation

We propose the Binomial Diversity framework to assess and enhance genre diversity

Genre selection as
Binomial Distribution

coin toss
\begin{align} P(X_g = k_g) = \binom{N}{k_g} p_g^{k_g} (1 - p_g)^{N - k_g} \end{align}

$k_g$: times a genre $g$ appears in the recommendation

$N$: recommendation list size

$p_g$: probability of selecting genre $g$
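As a quick numerical illustration, the probability mass function can be evaluated with the Python standard library. The values of `N` and `p_g` below are hypothetical and chosen only for the example.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


N = 20      # recommendation list size (hypothetical)
p_g = 0.1   # probability of selecting genre g (hypothetical)

# Probability of seeing the genre 0, 1, 2 or 3 times in a random list.
for k_g in range(4):
    print(k_g, round(binom_pmf(k_g, N, p_g), 4))
```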

Genre probability

User interest $p''_g$: the fraction of items in the user's preferences that contain the genre

Generality $p'_g$: the fraction of items in all users' preferences that contain the genre

We combine them

\[ p_g = (1 - \alpha) \; p'_g + \alpha \; p''_g \]
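A minimal sketch of this combination, assuming preferences are available as plain lists of genre sets (one set per preferred item). The function names and the toy data are hypothetical.

```python
def genre_fraction(items, genre):
    """Fraction of items whose genre set contains `genre`."""
    return sum(genre in genres for genres in items) / len(items) if items else 0.0


def genre_probability(genre, user_items, all_items, alpha=0.5):
    """p_g = (1 - alpha) * p'_g + alpha * p''_g."""
    p_global = genre_fraction(all_items, genre)  # generality p'_g
    p_user = genre_fraction(user_items, genre)   # user interest p''_g
    return (1 - alpha) * p_global + alpha * p_user


# Toy data (hypothetical): one user's items and the pooled items of all users.
user_items = [{"Sci-Fi"}, {"Sci-Fi", "Action"}, {"Drama"}, {"Comedy"}]
all_items = user_items + [{"Drama"}] * 4 + [{"Comedy"}, {"Western"}]

p_scifi = genre_probability("Sci-Fi", user_items, all_items, alpha=0.5)
print(p_scifi)
```

With `alpha = 0` only generality counts, and with `alpha = 1` only the user's own interest counts.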

Now we have all the ingredients... let's cook the metrics!

Binomial coverage

\begin{align} Coverage(R) \end{align}

A genre is not covered, how serious is it?

Penalize lack of coverage by the probability of not selecting the genre in the recommendation

\[ P(X_g = 0) = (1 - p_g)^N \]
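The per-genre penalty can be turned into a list-level score along the following lines. The slide does not show how the penalties are aggregated, so this sketch assumes they are multiplied and normalized with a 1/|G| exponent, where G is the set of all genres. The genre probabilities are hypothetical.

```python
def coverage(rec, p, n):
    """Coverage score of a list of genre sets.

    rec: list of genre sets, one per recommended item
    p:   dict genre -> p_g
    n:   recommendation list size N
    Assumed aggregation: product over uncovered genres of
    P(X_g = 0) ** (1 / |G|).
    """
    covered = set().union(*rec)
    score = 1.0
    for g, p_g in p.items():
        if g not in covered:
            score *= ((1 - p_g) ** n) ** (1 / len(p))
    return score


# Hypothetical genre probabilities.
p = {"Western": 0.2, "Comedy": 0.3, "Sci-Fi": 0.3, "Adventure": 0.2, "Romance": 0.2}
westerns = [{"Western", "Comedy"}, {"Western", "Sci-Fi"}, {"Western", "Adventure"}]

print(coverage(westerns, p, n=3))    # only Romance is missing
print(coverage(westerns, p, n=10))   # same gap, judged for a longer list
```

Missing a genre costs more when the genre is likely (large `p_g`) or when the list is long (large `n`), which is where size-awareness enters.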

Binomial redundancy

\begin{align} NonRed(R) \end{align}

Two occurrences of the same genre are already redundant, but how big is the effect?

Penalize redundancy by the probability of observing this or higher number of items with the genre in the recommendation

\[ P(X_g \geq k_g | X_g > 0) \]
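A sketch of the redundancy penalty, conditioning on the genre appearing at least once ($X_g > 0$), so that a single occurrence is not penalized. The aggregation over genres is not shown on the slide; the geometric mean over the genres present in the list is an assumption, and the genre probabilities are hypothetical.

```python
from collections import Counter
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def upper_tail(k, n, p):
    """P(X >= k), summed directly to avoid negative rounding errors."""
    return sum(binom_pmf(l, n, p) for l in range(k, n + 1))


def tail_given_present(k, n, p):
    """P(X >= k | X > 0) for k >= 1 and 0 < p <= 1."""
    return upper_tail(k, n, p) / upper_tail(1, n, p)


def non_redundancy(rec, p):
    """Assumed aggregation: geometric mean over the genres present in rec."""
    n = len(rec)
    counts = Counter(g for genres in rec for g in genres)
    score = 1.0
    for g, k in counts.items():
        score *= tail_given_present(k, n, p[g]) ** (1 / len(counts))
    return score


# Hypothetical genre probabilities.
p = {"Western": 0.2, "Comedy": 0.3, "Sci-Fi": 0.3, "Adventure": 0.2, "Romance": 0.2}
westerns = [{"Western", "Comedy"}, {"Western", "Sci-Fi"}, {"Western", "Adventure"}]
mixed = [{"Western"}, {"Sci-Fi"}, {"Romance", "Comedy"}]

print(non_redundancy(westerns, p))  # Western appears three times
print(non_redundancy(mixed, p))     # no genre repeats
```

A genre that appears once contributes a factor of exactly 1, so only repeated genres lower the score.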

Binomial diversity

Combining coverage and redundancy
in a single diversity metric

\begin{align} BinomDiv(R) = Coverage(R) \cdot NonRed(R) \end{align}

This metric fulfills all our requirements!

Experiments

Setup


Netflix Prize + IMDb: 83M ratings, 480K users, 9.3K items, 28 genres

Experiment 1

Evaluate some state-of-the-art algorithms in terms of accuracy (nDCG@20) and diversity (BinomDiv@20) for different values of the generality-interest trade-off parameter $\alpha$.

Experiment 2

For the iMF baseline, apply greedy re-ranking approaches that directly optimize each diversity framework and compare the effects of measuring the optimizations with the metrics of the other frameworks

relative improvement w.r.t. iMF
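A greedy re-ranking that optimizes binomial diversity could look roughly like the sketch below. It is not the authors' implementation: the linear trade-off `lam` between relevance and diversity, the aggregation inside `binom_div`, the use of the target list size `n` while the list is still being built, and all scores, genre labels and probabilities are assumptions made for illustration.

```python
from collections import Counter
from math import comb


def pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def upper_tail(k, n, p):
    return sum(pmf(l, n, p) for l in range(k, n + 1))


def binom_div(rec, p, n):
    """Coverage times non-redundancy for a list of genre sets (assumed forms).

    Every genre of every item must have an entry in p, with 0 < p_g <= 1.
    """
    counts = Counter(g for genres in rec for g in genres)
    cov = 1.0
    for g, p_g in p.items():
        if g not in counts:
            cov *= pmf(0, n, p_g) ** (1 / len(p))
    non_red = 1.0
    for g, k in counts.items():
        ratio = upper_tail(k, n, p[g]) / upper_tail(1, n, p[g])
        non_red *= ratio ** (1 / len(counts))
    return cov * non_red


def greedy_rerank(scores, genres, p, n, lam=0.5):
    """Pick n items one by one, maximizing a relevance/diversity mix.

    scores: dict item -> relevance in [0, 1] from the baseline recommender
    genres: dict item -> genre set
    """
    selected, remaining = [], dict(scores)
    while remaining and len(selected) < n:
        def objective(item):
            div = binom_div([genres[i] for i in selected + [item]], p, n)
            return (1 - lam) * remaining[item] + lam * div
        best = max(remaining, key=objective)
        selected.append(best)
        del remaining[best]
    return selected


# Hypothetical baseline scores, genre labels and genre probabilities.
genres = {
    "Wild Wild West": {"Western", "Comedy"},
    "Cowboys and Aliens": {"Western", "Sci-Fi"},
    "The Good, the Bad and the Ugly": {"Western", "Adventure"},
    "Blade Runner": {"Sci-Fi"},
    "Love Actually": {"Comedy", "Romance"},
}
scores = {"Wild Wild West": 0.9, "Cowboys and Aliens": 0.85,
          "The Good, the Bad and the Ugly": 0.8, "Blade Runner": 0.7,
          "Love Actually": 0.6}
p = {"Western": 0.2, "Comedy": 0.3, "Sci-Fi": 0.3, "Adventure": 0.2, "Romance": 0.2}

print(greedy_rerank(scores, genres, p, n=3, lam=0.0))  # relevance only
print(greedy_rerank(scores, genres, p, n=3, lam=0.5))  # relevance and diversity
```

With `lam=0.0` the baseline ranking is returned unchanged. Larger values trade relevance for lists whose genre counts look more like those of a random draw.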

Conclusions

  • Assessment and enhancement of diversity in recommendations using genres.
  • Coverage, redundancy and recommendation list size-awareness as requirements for genre diversity
  • The Binomial Diversity framework satisfies these properties.
  • Experiments to illustrate our framework and compare it with alternative approaches.

Thanks! Questions?

Why coverage, redundancy and size-awareness?

Experience: having worked with the previous state-of-the-art frameworks, we know their properties and limitations

Common sense: they look reasonable to us

Experiment 2B

For each direct optimization for the iMF baseline, measure the average number of genres of the items in the re-rankings.

average number of genres of the recommended items

Experiment 3

Apply direct optimizations of the Binomial Diversity framework to the iMF algorithm with different recommendation list sizes and measure them with different recommendation list sizes.