Understanding GloVe Vectors
There are many articles that explain what word vectors are and how they are used. Here I will focus on how GloVe vectors are calculated and on the equations behind them. The authors' motivation for GloVe was to build a model that uses word-word co-occurrence counts directly and so makes efficient use of corpus statistics. The resulting model is a weighted least squares regression, with weights that depend on the word-word co-occurrence counts.

The regression equation is

$$w_i^\top \tilde{w}_k + b_i + \tilde{b}_k = \log X_{ik}$$

and the cost function is the weighted least squares objective

$$J = \sum_{i,j} f(X_{ij}) \left( w_i^\top \tilde{w}_j + b_i + \tilde{b}_j - \log X_{ij} \right)^2$$

Here $X_{ij}$ is the word-word co-occurrence count, that is, the number of times word $j$ occurs in the context of word $i$, and $f$ is the weighting function. In the paper, the model was trained with the AdaGrad optimizer. Training yields two sets of vectors, $w_i$ and $\tilde{w}_i$. The res...
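
To make the weighted least squares cost concrete, here is a minimal sketch in plain Python that evaluates J for a tiny, made-up co-occurrence table. The function names (`weight`, `dot`, `glove_cost`) and the toy numbers are my own and purely illustrative. The form of the weighting function, with x_max = 100 and alpha = 0.75, follows what the GloVe paper reports rather than anything stated above, so treat those values as an assumption. The sketch only evaluates the cost; it does not perform the AdaGrad training.

```python
import math


def weight(x, x_max=100.0, alpha=0.75):
    """Weighting function f(X_ij): grows with the count and is capped at 1.

    x_max and alpha are assumed defaults taken from the GloVe paper.
    """
    return (x / x_max) ** alpha if x < x_max else 1.0


def dot(u, v):
    """Dot product of two equal-length lists of floats."""
    return sum(a * b for a, b in zip(u, v))


def glove_cost(X, W, W_tilde, b, b_tilde):
    """Weighted least squares cost summed over the observed (i, j) pairs.

    X maps (i, j) index pairs to co-occurrence counts. Pairs with a
    non-positive count are skipped because log(0) is undefined.
    """
    total = 0.0
    for (i, j), x_ij in X.items():
        if x_ij <= 0:
            continue
        # Residual of the regression: w_i . w~_j + b_i + b~_j - log(X_ij)
        err = dot(W[i], W_tilde[j]) + b[i] + b_tilde[j] - math.log(x_ij)
        total += weight(x_ij) * err ** 2
    return total


# Toy data (hypothetical): 2 words, 2-dimensional vectors.
X = {(0, 1): 4.0, (1, 0): 4.0, (0, 0): 1.0}
W = [[0.1, 0.2], [0.3, -0.1]]        # word vectors w_i
W_tilde = [[0.0, 0.1], [0.2, 0.2]]   # context vectors w~_j
b = [0.0, 0.1]                       # word biases b_i
b_tilde = [0.05, 0.0]                # context biases b~_j

cost = glove_cost(X, W, W_tilde, b, b_tilde)
print(cost)
```

Storing X as a dictionary of observed pairs mirrors the fact that only nonzero counts contribute to the sum, which is what keeps the cost cheap to evaluate on a sparse co-occurrence matrix.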