Sales Diversity by Recommending Users to Items

Novelty and Popularity Bias

A few highly popular items concentrate most of the sales volume, while the rest (the long tail) sell only moderately

Promoting sales in the long tail has benefits:

- For the business: make the most of the catalog, niche markets $\rightarrow$ **Sales diversity**
- For the user: less obvious, unexpected recommendations $\rightarrow$ **Novelty**

Personalized recommendations could help us promote items in the Long Tail

**but...**

Effect

Personalized recommendations have a bias towards recommending popular items

This has a negative effect in terms of novelty and sales diversity

**This research aims at alleviating this popularity bias**

Users to Items

We transpose the recommendation task by selecting which users each item should be recommended to, instead of the other way around

We initially ignore the popularity bias of the items, focusing on which users each item should be recommended to

We consider two approaches: **inverted neighborhoods** and a **probabilistic reformulation layer**

Standard recommendation task

rating matrix $\mathcal{R}$

\[ s: \mathcal{U} \times \mathcal{I} \rightarrow \mathbb{R} \qquad R_u \in \mathcal{I} \times \mathcal{I} \times \ldots \]

Inverted recommendation task

rating matrix $\tilde{\mathcal{R}} = \mathcal{R}^t$

\[ \tilde{s}: \mathcal{I} \times \mathcal{U} \rightarrow \mathbb{R} \qquad \tilde{R}_i \in \mathcal{U} \times \mathcal{U} \times \ldots \]

Some recommendation algorithms are symmetric: the transposed task is fully equivalent to the original task

Example: implicit matrix factorization

\[ \tilde{s}_{iMF}(i, u) = q_i \, p_u^t = p_u \, q_i^t = s_{iMF}(u, i) \]

For other algorithms, swapping the roles of users and items is not inconsequential and results in a new scoring function

**This is the case of the nearest neighbors algorithms**
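The symmetry of implicit matrix factorization can be sanity-checked with a tiny sketch (the factor vectors below are made up for illustration): since the score is a dot product, transposing the task changes nothing.

```python
# Toy check of MF symmetry: s_iMF(u, i) = p_u . q_i = q_i . p_u = s~_iMF(i, u).
# The latent factor vectors here are hypothetical.

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

p_u = [0.3, 1.2, -0.5]  # latent factors of user u
q_i = [0.8, 0.1, 0.4]   # latent factors of item i

s_standard = dot(p_u, q_i)  # standard task: score item i for user u
s_inverted = dot(q_i, p_u)  # inverted task: score user u for item i

assert s_standard == s_inverted  # the transposed task is fully equivalent
```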

Neighbors

Standard $K$-nearest neighbors recommendation

\begin{align} s_{UB}(u, i) &= \sum_{v \in \mathcal{U}} \mathbf{1}_{v \in N_K(u)} \; sim(u, v) \; r_{v,i} \\ s_{IB}(u, i) &= \sum_{j \in \mathcal{I}_u} \mathbf{1}_{i \in N_K(j)} \; sim(i, j) \; r_{u,j} \end{align}

Inverted $K$-nearest neighbors recommendation

\begin{align} \tilde{s}_{UB}(i, u) &= \sum_{j \in \mathcal{I}} \mathbf{1}_{j \in N_K(i)} \; sim(i, j) \; \tilde{r}_{j,u} \\ \tilde{s}_{IB}(i, u) &= \sum_{v \in \mathcal{U}_i} \mathbf{1}_{u \in N_K(v)} \; sim(u, v) \; \tilde{r}_{i, v} \end{align}

$\mathbf{1}_{v \in N_K(u)} \rightarrow$ *Use the preferences of my neighbors*

$\mathbf{1}_{u \in N_K(v)} \rightarrow$ *Use the preferences of the users I am a neighbor of*

The inverted criterion can be reformulated as $\mathbf{1}_{u \in N_K(v)} = \mathbf{1}_{v \in N^{-1}_K(u)}$, where

\[ N^{-1}_K(u) = \left\lbrace v \in \mathcal{U} \;:\; u \in N_K(v) \right\rbrace \]

is the **inverted neighborhood** of $u$

Properties

- Variable neighborhood size
- All users/items appear in exactly the same number of neighborhoods ($K$)
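Both properties can be illustrated with a minimal sketch (the users and similarity values below are hypothetical): inverted neighborhoods have variable size, yet every user appears in exactly $K$ of them.

```python
# Inverted neighborhoods on toy data (users and similarities are made up).
# N_K(u): the K users most similar to u.
# N_K^{-1}(u) = {v : u in N_K(v)}: the users that count u among their top-K.

def top_k_neighbors(sim, u, users, K):
    others = [v for v in users if v != u]
    return set(sorted(others, key=lambda v: sim[(u, v)], reverse=True)[:K])

def inverted_neighborhood(sim, u, users, K):
    return {v for v in users
            if v != u and u in top_k_neighbors(sim, v, users, K)}

users = ["a", "b", "c", "d"]
raw = {("a", "b"): 0.9, ("a", "c"): 0.2, ("a", "d"): 0.1,
       ("b", "c"): 0.8, ("b", "d"): 0.3, ("c", "d"): 0.7}
sim = {}
for (u, v), s in raw.items():
    sim[(u, v)] = sim[(v, u)] = s  # symmetric similarity

K = 2
sizes = sorted(len(inverted_neighborhood(sim, u, users, K)) for u in users)
assert sizes == [1, 1, 3, 3]  # variable neighborhood size

# ...but every user appears in exactly K inverted neighborhoods
for u in users:
    assert sum(u in inverted_neighborhood(sim, w, users, K)
               for w in users if w != u) == K
```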

Consequences

- User-based: flattens the influence of individual users
- Item-based: rare items get more chances to be recommended

An analysis of the neighborhoods of Netflix Prize data and Million Song dataset shows

- Neighborhood concentration: a few users/items appear in many neighborhoods
- Popularity bias: in the item-based algorithm, neighborhoods are composed of very popular items

**The inverted neighborhoods eliminate these biases**

Reformulation Layer

Probabilistic interpretation \[p(i\,|\,u)\]

Inverted recommendation \[p(u\,|\,i)\]

We take the scoring function $s(u,i)$ as a good estimator of the probability $p(u, i)$
\[
p(u\,|\,i;s) \sim \frac{s(u, i)}{\sum_v s(v, i)}
\]

What about $p(i\,|\,u)$?

Bayes' rule \[ p(i\,|\,u;s) = \frac{p(u\,|\,i;s)\;p(i;s)}{\sum_j p(u\,|\,j;s)\;p(j;s)} \]

Estimating the item prior \[ p(i;s) \sim \frac{\sum_u s(u,i)}{\sum_j \sum_u s(u,j)} \]

**The item prior captures the popularity**

**Idea:** smooth the prior by means of an entropic regularization
\[
p(i;s, \alpha) \sim \frac{\left(\sum_u s(u,i)\right)^{1 - \alpha}}{\sum_j \left(\sum_u s(u,j)\right)^{1 - \alpha}}
\]

$\alpha$ controls the smoothing applied

\begin{align} \alpha = 0 & \Rightarrow p(i; s, \alpha) = p(i; s) \\ \alpha = 1 & \Rightarrow p(i; s, \alpha) = \frac{1}{|\mathcal{I}|} \end{align}
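The whole reformulation layer can be sketched end to end on toy scores (the score values below are hypothetical): estimate $p(u\,|\,i;s)$, smooth the item prior with $\alpha$, and apply Bayes' rule. With $\alpha = 0$ the popular item dominates the ranking; with $\alpha = 1$ the popularity signal disappears.

```python
# Reformulation layer on hypothetical non-negative scores s[u][i].
scores = {
    "u1": {"i1": 4.0, "i2": 1.0, "i3": 1.0},  # i1 plays the "popular" item
    "u2": {"i1": 5.0, "i2": 0.5, "i3": 0.5},
}
items = ["i1", "i2", "i3"]

def p_u_given_i(u, i):
    # p(u|i;s) ~ s(u,i) / sum_v s(v,i)
    return scores[u][i] / sum(scores[v][i] for v in scores)

def prior(i, alpha):
    # p(i;s,alpha) ~ (sum_u s(u,i))^(1-alpha), normalized over items
    mass = lambda j: sum(scores[v][j] for v in scores) ** (1 - alpha)
    return mass(i) / sum(mass(j) for j in items)

def p_i_given_u(i, u, alpha):
    # Bayes' rule with the smoothed item prior
    num = lambda j: p_u_given_i(u, j) * prior(j, alpha)
    return num(i) / sum(num(j) for j in items)

assert abs(prior("i1", 1.0) - 1 / 3) < 1e-9  # alpha = 1: uniform prior
# alpha = 0: the raw popularity prior makes the popular item win
assert p_i_given_u("i1", "u1", 0.0) > p_i_given_u("i2", "u1", 0.0)
# alpha = 1: the popularity bias is removed and i1 no longer wins
assert p_i_given_u("i2", "u1", 1.0) > p_i_given_u("i1", "u1", 1.0)
```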

| | ratings | users | items |
|---|---|---|---|
| Netflix | 100M | 480K | 18K |
| MSD | 48M | 1.1M | 380K |

- Accuracy: precision
- Novelty: EPC (Vargas and Castells 2011)
- Sales diversity: Gini Index (complement)
- All metrics evaluated at cutoff 10
- Higher is better
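As an illustration of the sales-diversity metric, here is a minimal sketch of the Gini-index complement over per-item recommendation counts; this uses the common Gini formulation, and the exact normalization used in the evaluation may differ.

```python
# Complement of the Gini index over how often each item gets recommended:
# 1 = perfectly even exposure; lower values = exposure concentrated on few items.
def gini_complement(rec_counts):
    n = len(rec_counts)
    total = sum(rec_counts)
    xs = sorted(rec_counts)  # ascending order
    gini = sum((2 * (k + 1) - n - 1) * x for k, x in enumerate(xs)) / (n * total)
    return 1 - gini

even = gini_complement([10, 10, 10, 10])  # uniform exposure across 4 items
skewed = gini_complement([37, 1, 1, 1])   # one blockbuster item
assert abs(even - 1.0) < 1e-9
assert skewed < even  # concentration lowers sales diversity
```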

Comparing inverted neighborhoods with standard neighborhoods for different neighborhood sizes

Comparing the probabilistic reformulation with a direct novelty optimization $s_{NR}$

\[ s_{NR}(u, i) = (1 - \lambda)\; s(u, i) + \lambda \; nov(i) \]

Conclusions

- We address the popularity bias in recommendations by recommending users to items
- A first method inverts the role of users and items for nearest neighbors
- A second method develops a probabilistic interpretation to isolate and then smooth the popularity bias
- Experiments on two datasets show the effectiveness of our approaches
- Inverted neighborhood: clear improvements over the standard neighborhoods in novelty and sales diversity
- Probabilistic reformulation: competitive against a direct optimization for novelty, outperforms in sales diversity

Thank you for your attention!