Bring recommendations to users
Create an embedding vector for users and items, and then compare vector distances
It can create a positive feedback loop that reinforces the tastes of a small group. For example, anime watchers watch a lot of anime, so the recommender may become biased towards anime
The rows (left side, y axis) represent users and their embeddings; the columns (x axis) represent items and their embeddings. The cell where a row and a column cross holds the dependent variable
*** Not done
Latent factors are indirect variables: they are not directly observed, but are inferred from a combination of other variables. Latent means hidden or concealed
The dot product is the element-wise multiplication of two vectors, followed by summing all the products
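A minimal sketch of that definition in plain Python. The function name `dot_product` and the factor values are made up for illustration; they are not taken from any library or dataset.

```python
def dot_product(a, b):
    """Multiply matching elements of a and b, then sum the products."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x * y for x, y in zip(a, b))

# Hypothetical latent factors for one user and one movie
user_factors = [0.5, -1.0, 2.0]
movie_factors = [2.0, 1.0, 0.5]

score = dot_product(user_factors, movie_factors)
print(score)  # 0.5*2.0 + (-1.0)*1.0 + 2.0*0.5 = 1.0
```

A larger score means the user's factors and the movie's factors point in a similar direction.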
What does pandas.DataFrame.merge do?
Combines two data frames into one, aligning their rows on a shared column (the join key)
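A small sketch of a merge on a shared column. The two frames and their column names (`user`, `movie`, `rating`, `title`) are invented for illustration and only loosely resemble a ratings dataset.

```python
import pandas as pd

# Hypothetical ratings table and movie-title table
ratings = pd.DataFrame({
    "user": [1, 1, 2],
    "movie": [10, 20, 10],
    "rating": [4, 5, 3],
})
movies = pd.DataFrame({
    "movie": [10, 20],
    "title": ["Movie A", "Movie B"],
})

# Align rows on the shared "movie" column (an inner join by default)
merged = ratings.merge(movies, on="movie")
print(merged)
```

Each rating row now carries the title of the movie it refers to.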
An embedding matrix has one row per user (or item) and one column per latent factor
You use the one-hot encoded vector to pull the embeddings of 1 user. You can think of an embedding as a compressed version of the one-hot encoded vectors
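The equivalence between a one-hot lookup and plain indexing can be checked numerically. This is a sketch using NumPy as a stand-in for a deep learning library; the sizes and the names `user_emb` and `user_idx` are assumptions chosen for the example.

```python
import numpy as np

n_users, n_factors = 4, 3
rng = np.random.default_rng(0)

# Randomly initialized embedding matrix: one row per user
user_emb = rng.normal(size=(n_users, n_factors))

# One-hot vector that selects user 2
user_idx = 2
one_hot = np.zeros(n_users)
one_hot[user_idx] = 1.0

via_matmul = one_hot @ user_emb   # matrix multiply with the one-hot vector
via_index = user_emb[user_idx]    # direct row lookup

print(np.allclose(via_matmul, via_index))  # True
```

Both routes return the same row, but indexing skips the multiplications by zero.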
Embedding if we could use one-hot encoded vectors for the same thing?
Embeddings use far less memory, especially when the cardinality is high. Embeddings also turn categories into continuous variables
Randomly initialized numbers
*** Not done
What does x[:,0] return?
Every row of column 0 (first column)
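A quick check of that slice, using a NumPy array as a stand-in for a tensor (PyTorch tensors accept the same indexing syntax). The array contents are arbitrary.

```python
import numpy as np

x = np.array([[1, 10],
              [2, 20],
              [3, 30]])

first_col = x[:, 0]     # all rows, column 0
print(first_col)        # [1 2 3]
print(first_col.shape)  # (3,)
```

The result is one-dimensional: the column axis is dropped by the integer index.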
DotProduct class (without peeking, if possible!) and train a model with it
*** Not done
Mean squared error, because the target is a numeric rating on a scale (1, 2, 3, 4, 5), so the distance between the prediction and the target is meaningful
CrossEntropy loss with MovieLens? How would we need to change the model?
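Binary cross-entropy would not work, because it expects a target of 1 or 0. You would need categorical cross-entropy, treating each rating (1 to 5) as a separate class, so the model would have to output five activations per user-item pair instead of a single number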
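The bias is a per-user and per-item offset added to the dot product. It captures the fact that some users rate everything higher or lower than others, and some items are generally rated better or worse, independently of the latent factors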
L2 regularization
total_loss = loss + sum(wd*(w**2))
weight = weight - lr*grad
grad = grad + 2*wd*weight
weight = weight - lr*(grad + 2*wd*weight)
weight = (1 - 2*lr*wd)*weight - lr*grad
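The update can be checked numerically for a single weight. This sketch starts from total_loss = loss + sum(wd*(w**2)), whose derivative adds 2*wd*weight to the gradient; the values of `lr`, `wd`, `weight`, and `grad` are arbitrary numbers picked for the example.

```python
import math

lr, wd = 0.1, 0.01        # learning rate and weight decay (arbitrary values)
weight, grad = 2.0, 0.5   # current weight and gradient of the unpenalized loss

# Gradient of loss + wd * weight**2 with respect to weight
full_grad = grad + 2 * wd * weight

# Plain SGD step on the penalized loss
step_a = weight - lr * full_grad

# Same step written as "shrink the weight, then step on the original gradient"
step_b = (1 - 2 * lr * wd) * weight - lr * grad

print(math.isclose(step_a, step_b))  # True
```

The second form shows where the name comes from: every step multiplies the weight by a factor slightly below 1 before applying the usual gradient update.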
With small weights, no single weight can dominate, so more neurons/weights are used and have to share features among themselves instead of relying on just 1 large weight
What does argsort do in PyTorch?
argsort gives you the indices that would sort the values, rather than the sorted values themselves
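A short illustration, using `numpy.argsort` as a stand-in since it follows the same idea as `torch.argsort` (which additionally accepts a `descending` flag). The `biases` values are made up.

```python
import numpy as np

# Hypothetical bias values for four movies
biases = np.array([0.3, -0.5, 0.9, 0.1])

idxs = np.argsort(biases)   # indices in ascending order of value
print(idxs)                 # [1 3 0 2]
print(biases[idxs])         # [-0.5  0.1  0.3  0.9]

# Reverse the index order to list the highest values first
top_first = idxs[::-1]
print(top_first)            # [2 0 3 1]
```

Indexing a list of titles with `idxs` would give the movies ordered from lowest to highest bias.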
No. Sorting the movie biases gives additional information: for a given movie, even the people who like that genre may not like that movie. The overall movie rating only gives the average across the people who rated it
learn.model
How to recommend things to new users who have no previous history. You can’t bootstrap them to previous knowledge, because you don’t have any
Start new users at the average, ask them questions, or collect metadata about them
They could impact systems negatively if there is a reinforcing bias
Each represents a different level of complexity, so the embeddings can have different sizes. We flatten and concatenate all these features before feeding them into the neural network
nn.Sequential in the CollabNN model?
To create a small NN model by chaining several layers together in order
EmbeddingNN, which inherits from TabularModel