Improving
Sales Diversity
by Recommending
Users to Items

Saúl Vargas and Pablo Castells

RecSys 2014 Silicon Valley

UAM

Sales Diversity,
Novelty and
Popularity Bias

The Long Tail

A few of the most popular items concentrate high sales volumes, while the rest (the long tail) have moderate ones

Promoting sales in the long tail has benefits:

  • For the business: make the most of the catalog, niche markets $\rightarrow$ Sales diversity
  • For the user: less obvious, unexpected recommendations $\rightarrow$ Novelty

Personalized recommendations could help us promote items in the Long Tail

but...


The Harry Potter
Effect

Personalized recommendations have a bias towards recommending popular items

This bias hurts novelty and sales diversity

This research aims to alleviate this popularity bias

Recommending
Users to Items

Idea

We transpose the recommendation task by selecting which users each item should be recommended to, instead of the other way around

We initially set aside the popularity bias of the items and focus on which users each item should be recommended to

We consider two approaches: inverted neighborhoods and a probabilistic reformulation layer

Formulation

Standard recommendation task

rating matrix $\mathcal{R}$

\[s: \mathcal{U} \times \mathcal{I} \rightarrow \mathbb{R}\] \[R_u \in \mathcal{I} \times \mathcal{I} \times \ldots\]

Inverted recommendation task

rating matrix $\tilde{\mathcal{R}} = \mathcal{R}^t$

\[\tilde{s}: \mathcal{I} \times \mathcal{U} \rightarrow \mathbb{R}\] \[\tilde{R}_i \in \mathcal{U} \times \mathcal{U} \times \ldots\]

Some recommendation algorithms are symmetric: the transposed task is fully equivalent to the original task

Example: implicit matrix factorization

\[ \tilde{s}_{iMF}(i, u) = q_i \, p_u^t = p_u \, q_i^t = s_{iMF}(u, i) \]
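A minimal numpy sketch of this symmetry (random toy factors, not learned ones, are an assumption for illustration):

    import numpy as np

    rng = np.random.default_rng(0)
    P = rng.normal(size=(5, 3))    # user factors p_u (5 users, 3 dims)
    Q = rng.normal(size=(4, 3))    # item factors q_i (4 items, 3 dims)

    u, i = 2, 1
    s_ui = P[u] @ Q[i]             # s_iMF(u, i) = p_u q_i^t
    s_iu = Q[i] @ P[u]             # s~_iMF(i, u) = q_i p_u^t
    assert np.isclose(s_ui, s_iu)  # the transposed task is equivalent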

For other algorithms, swapping the roles of users and items is not inconsequential: it results in a new scoring function

This is the case of the nearest neighbors algorithms

Inverted Nearest
Neighbors

Standard $K$-nearest neighbors recommendation

\begin{align} s_{UB}(u, i) &= \sum_{v \in \mathcal{U}} \mathbf{1}_{v \in N_K(u)} \; sim(u, v) \; r_{v,i} \\ s_{IB}(u, i) &= \sum_{j \in \mathcal{I}_u} \mathbf{1}_{i \in N_K(j)} \; sim(i, j) \; r_{u,j} \end{align}

Inverted $K$-nearest neighbors recommendation

\begin{align} \tilde{s}_{UB}(i, u) &= \sum_{j \in \mathcal{I}} \mathbf{1}_{j \in N_K(i)} \; sim(i, j) \; \tilde{r}_{j,u} \\ \tilde{s}_{IB}(i, u) &= \sum_{v \in \mathcal{U}_i} \mathbf{1}_{u \in N_K(v)} \; sim(u, v) \; \tilde{r}_{i, v} \end{align}
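To make the two scoring rules concrete, a minimal sketch on assumed toy data (binary ratings and cosine similarity are assumptions, not necessarily the paper's exact configuration), contrasting the standard user-based score with its inverted counterpart:

    import numpy as np

    # toy binary user-item rating matrix (4 users x 4 items)
    R = np.array([[1, 1, 0, 0],
                  [1, 0, 1, 0],
                  [0, 1, 1, 1],
                  [1, 1, 1, 0]], dtype=float)
    K = 2

    Rn = R / np.linalg.norm(R, axis=1, keepdims=True)
    sim = Rn @ Rn.T                      # cosine similarity between users
    np.fill_diagonal(sim, 0.0)
    N = np.argsort(-sim, axis=1)[:, :K]  # N_K(u): K most similar users

    def s_standard(u, i):
        # sum over v in N_K(u): use the preferences of my neighbors
        return sum(sim[u, v] * R[v, i] for v in N[u])

    def s_inverted(u, i):
        # sum over v with u in N_K(v): use the preferences of the
        # users I am a neighbor of
        return sum(sim[u, v] * R[v, i] for v in range(len(R)) if u in N[v])

    print(s_standard(0, 2), s_inverted(0, 2))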

Inverted Neighborhoods

$\mathbf{1}_{v \in N_K(u)} \rightarrow$ Use the preferences of my neighbors

$\mathbf{1}_{u \in N_K(v)} \rightarrow$ Use the preferences of the users I am a neighbor of

The inverted criterion can be reformulated as $\mathbf{1}_{u \in N_K(v)} = \mathbf{1}_{v \in N^{-1}_K(u)}$, where $N^{-1}_K(u) = \left\lbrace v \in \mathcal{U} \;:\; u \in N_K(v) \right\rbrace$ is the inverted neighborhood of $u$

An Example ($K = 2$)

[Figure: standard neighborhoods $N_K$ vs. inverted neighborhoods $N^{-1}_K$ for $K = 2$]
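The same construction in a minimal Python sketch (hypothetical toy neighbor lists, not the figure's actual data):

    # inverted neighborhoods for K = 2 on assumed toy neighbor lists
    K = 2
    N = {0: [1, 3], 1: [0, 3], 2: [3, 1], 3: [0, 1]}  # N_K(u), 4 users

    # N^{-1}_K(u) = {v : u in N_K(v)}
    N_inv = {u: [v for v in N if u in N[v]] for u in N}
    print(N_inv)   # {0: [1, 3], 1: [0, 2, 3], 2: [], 3: [0, 1, 2]}

    # inverted neighborhoods have variable size (user 2's is empty),
    # yet every user appears in exactly K of them
    counts = {u: sum(u in N_inv[w] for w in N_inv) for u in N}
    print(counts)  # {0: 2, 1: 2, 2: 2, 3: 2}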

Properties and Consequences

Properties

  • Variable neighborhood size
  • All users/items will appear in the same number of neighborhoods ($K$)

Consequences

  • User-based: flattens the influence of individual users
  • Item-based: gives rare items more chances to be recommended

Neighborhood Biases

An analysis of the neighborhoods in the Netflix Prize and Million Song datasets shows:

  • Neighborhood concentration: a few users/items appear in many neighborhoods
  • Popularity bias: in the item-based algorithm, neighborhoods are composed of very popular items

The inverted neighborhoods eliminate these biases

Probabilistic
Reformulation Layer

Probabilistic interpretation \[p(i\,|\,u)\]

Inverted recommendation \[p(u\,|\,i)\]

We take the scoring function $s(u,i)$
as a good estimator of the probability $p(u, i)$ \[ p(u\,|\,i;s) \sim \frac{s(u, i)}{\sum_v s(v, i)} \]

What about $p(i\,|\,u)$?

Bayes' rule \[ p(i\,|\,u;s) = \frac{p(u\,|\,i;s)\;p(i;s)}{\sum_j p(u\,|\,j;s)\;p(j;s)} \]

Estimating the item prior \[ p(i;s) \sim \frac{\sum_u s(u,i)}{\sum_j \sum_u s(u,j)} \]

The item prior captures item popularity

Idea: smooth the prior by means of an entropic regularization \[ p(i;s, \alpha) \sim \frac{\left(\sum_u s(u,i)\right)^{1 - \alpha}}{\sum_j \left(\sum_u s(u,j)\right)^{1 - \alpha}} \]

$\alpha$ controls the smoothing applied

\begin{align} \alpha = 0 & \Rightarrow p(i; s, \alpha) = p(i; s) \\ \alpha = 1 & \Rightarrow p(i; s, \alpha) = \frac{1}{|\mathcal{I}|} \end{align}
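The step from here to the final formula, not spelled out on the slide: substitute both estimates into Bayes' rule and drop the denominator, which is constant for a fixed user $u$ and does not affect the ranking \[ p(i\,|\,u; s, \alpha) \;\propto\; p(u\,|\,i;s)\; p(i;s,\alpha) \;\propto\; \frac{s(u,i)}{\sum_v s(v,i)} \left(\sum_v s(v,i)\right)^{1-\alpha} = s(u,i) \left(\sum_v s(v,i)\right)^{-\alpha} \]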

Resulting scoring schema \[ s_{BR}(u, i) = s(u, i) \left(\sum_v s(v, i)\right)^{-\alpha} \]
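A minimal sketch of this layer on an assumed dense score matrix (toy random scores; real scoring functions would typically be sparse):

    import numpy as np

    rng = np.random.default_rng(0)
    S = rng.random((100, 50))       # s(u, i): 100 users x 50 items, toy
    alpha = 0.5                     # 0 = keep popularity, 1 = remove it

    item_mass = S.sum(axis=0)       # sum_v s(v, i): a popularity proxy
    S_br = S * item_mass ** -alpha  # s_BR penalizes high-mass items

    u = 0                           # top-10 items before / after the layer
    print(np.argsort(-S[u])[:10])
    print(np.argsort(-S_br[u])[:10])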

Experiments

Setup

Two datasets

           ratings   users   items
Netflix       100M    480K     18K
MSD            48M    1.1M    380K

Metrics

  • Accuracy: precision
  • Novelty: EPC (Vargas and Castells 2011)
  • Sales diversity: complement of the Gini index
  • All metrics evaluated at cutoff 10
  • Higher is better

Experiment 1

Comparing inverted neighborhoods with standard neighborhoods for different neighborhood sizes

Experiment 1 - Netflix

[Results plots: standard vs. inverted neighborhoods on Netflix]

Experiment 1 - MSD

[Results plots: standard vs. inverted neighborhoods on MSD]

Experiment 2

Comparing the probabilistic reformulation with a direct novelty optimization $s_{NR}$

\[ s_{NR}(u, i) = (1 - \lambda)\; s(u, i) + \lambda \; nov(i) \]
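A minimal sketch of this baseline (toy data; modeling $nov(i)$ as the item's self-information is a common choice consistent with EPC, but an assumption with respect to the slide):

    import numpy as np

    rng = np.random.default_rng(0)
    S = rng.random((100, 50))         # base scores s(u, i), toy data
    pop = rng.integers(1, 1000, 50)   # toy item rating counts
    nov = -np.log2(pop / pop.sum())   # novelty: item self-information
    nov = nov / nov.max()             # normalize to [0, 1], like S

    lam = 0.4                         # accuracy / novelty trade-off
    S_nr = (1 - lam) * S + lam * nov  # s_NR, broadcast over users

    print(np.argsort(-S_nr[0])[:10])  # top-10 re-ranked items, user 0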

Experiment 2 - Netflix

[Results plots: probabilistic reformulation vs. direct novelty optimization on Netflix]

Experiment 2 - MSD

[Results plots: probabilistic reformulation vs. direct novelty optimization on MSD]

Conclusions

  • We address the popularity bias in recommendations by recommending users to items
  • A first method inverts the role of users and items for nearest neighbors
  • A second method develops a probabilistic interpretation to isolate and then smooth the popularity bias
  • Experiments on two datasets show the effectiveness of our approaches
    • Inverted neighborhood: clear improvements over the standard neighborhoods in novelty and sales diversity
    • Probabilistic reformulation: competitive with a direct novelty optimization on novelty, and superior in sales diversity

Thank you for
your attention!

Questions?