Query Expansion with Locally-Trained Word Embeddings
Fernando Diaz, Bhaskar Mitra, Nick Craswell
Microsoft
July 21, 2016
1 / 22
word embedding: discriminatively trained vector representation
2 / 22
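$$\mathcal{L} = \sum_{t=1}^{T} \underbrace{\omega_{x_t}}_{\text{term weight}} \left[ \underbrace{\sum_{y \in V_t^{c}} \log \sigma\big(\phi(x_t) \cdot \phi(y)\big)}_{\text{observed context}} + \underbrace{\sum_{y \in V_t^{n}} \log \sigma\big(-\phi(x_t) \cdot \phi(y)\big)}_{\text{negative context}} \right]$$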
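The objective above can be sketched in a few lines of Python. This is only an illustration, assuming the term weight multiplies both the observed-context and negative-context sums and that a single embedding function ϕ is used for terms and contexts, as written on the slide. The names `weighted_sgns_loss`, `positions`, `phi`, and `omega` are hypothetical, and the toy vectors are made up.

```python
import math

def log_sigmoid(z):
    # Numerically stable log(sigma(z)).
    if z >= 0:
        return -math.log1p(math.exp(-z))
    return z - math.log1p(math.exp(z))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def weighted_sgns_loss(positions, phi, omega):
    """Sum the weighted objective over training positions.

    positions: list of (x_t, observed_context_terms, negative_terms)
    phi:       dict mapping term -> embedding vector (hypothetical toy values)
    omega:     dict mapping term -> term weight
    """
    total = 0.0
    for x_t, observed, negatives in positions:
        pos = sum(log_sigmoid(dot(phi[x_t], phi[y])) for y in observed)
        neg = sum(log_sigmoid(-dot(phi[x_t], phi[y])) for y in negatives)
        # Assumption: the weight scales both sums for this position.
        total += omega[x_t] * (pos + neg)
    return total

# Toy example: all-zero embeddings give log(sigma(0)) = -log(2) per pair.
phi = {"tax": [0.0, 0.0], "cut": [0.0, 0.0], "dog": [0.0, 0.0]}
omega = {"tax": 2.0}
positions = [("tax", ["cut"], ["dog"])]
loss = weighted_sgns_loss(positions, phi, omega)
```

Raising `omega` for a term makes every training position centered on that term count more, which is the lever the following slides use.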
3 / 22
$\omega_{x_t}$ needs to reflect the importance of the term at evaluation time.
4 / 22
$$\sum_{t=1}^{T} \omega_{x_t = w} \propto p(w \mid C)$$
5 / 22
what terms are important at query time?
6 / 22
$p(w \mid R)$: probability of the term in the relevant documents.
7 / 22
how different is $p(w \mid R)$ from $p(w \mid C)$?
8 / 22
$$\mathrm{KL}(R, C)_w = p(w \mid R) \log \frac{p(w \mid R)}{p(w \mid C)}$$
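As a rough illustration of the per-term quantity above, the sketch below computes each term's contribution and ranks terms by it. The two distributions are invented toy values, and `kl_contribution` is a hypothetical helper name. The sketch assumes every term with nonzero p(w|R) also has nonzero p(w|C).

```python
import math

def kl_contribution(p_rel, p_col, w):
    """Per-term contribution p(w|R) * log(p(w|R) / p(w|C))."""
    pr = p_rel.get(w, 0.0)
    if pr == 0.0:
        # Terms absent from the relevant documents contribute nothing.
        return 0.0
    return pr * math.log(pr / p_col[w])

# Toy distributions (made up for illustration).
p_rel = {"tax": 0.4, "cut": 0.4, "the": 0.2}
p_col = {"tax": 0.05, "cut": 0.1, "the": 0.2, "dog": 0.65}

ranked = sorted(
    ((w, kl_contribution(p_rel, p_col, w)) for w in p_col),
    key=lambda pair: pair[1],
    reverse=True,
)
```

In this toy case the topical terms rise to the top of the ranking, while a term that is equally likely in both distributions contributes zero.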
9 / 22
[Figure: per-term KL (y-axis, 0.00 to 0.35) plotted against term rank (x-axis).]
10 / 22
how much better can we do if we train with $\sum_{t=1}^{T} \omega_{x_t = w} \propto p(w \mid R)$?
11 / 22
Language Model Scoring
$$\mathrm{score}(d, q) = \mathrm{KL}(\theta_q, \theta_d)$$
$\theta_q$: maximum likelihood query language model
$\theta_d$: document language model
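A minimal sketch of this scoring function follows. It assumes the document language models are already smoothed, so that every query term has nonzero probability under each document. The smoothing method is not specified on the slide. The function names and toy models are hypothetical, and the sketch ranks documents by ascending divergence, that is, closest model first.

```python
import math

def max_likelihood_lm(tokens):
    """Maximum likelihood language model from a list of query tokens."""
    counts = {}
    for t in tokens:
        counts[t] = counts.get(t, 0) + 1
    n = len(tokens)
    return {t: c / n for t, c in counts.items()}

def kl_divergence(theta_q, theta_d):
    """KL(theta_q, theta_d); assumes theta_d[w] > 0 for every query term w."""
    return sum(p * math.log(p / theta_d[w]) for w, p in theta_q.items() if p > 0.0)

theta_q = max_likelihood_lm(["tax", "cut"])
# Toy document models restricted to the query vocabulary (made-up values).
doc_models = {
    "d1": {"tax": 0.5, "cut": 0.5},
    "d2": {"tax": 0.25, "cut": 0.75},
}
scores = {d: kl_divergence(theta_q, theta_d) for d, theta_d in doc_models.items()}
ranking = sorted(scores, key=scores.get)  # smallest divergence first
```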
12 / 22
Query Expansion with Word Embeddings
$$\tilde{\theta}_q = U U^{\mathsf{T}} \theta_q$$
$U$: $|V| \times k$ term embedding matrix
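To make the expansion formula concrete, here is a small pure-Python sketch. The matrix `U` and the query vector are toy values, `expand_query` is a hypothetical name, and the result is left unnormalized because the slide does not say how the expanded weights are rescaled or combined with the original query.

```python
def expand_query(U, theta_q):
    """Compute U @ U.T @ theta_q for a |V| x k matrix U given as nested lists."""
    n_terms = len(U)
    k = len(U[0])
    # Step 1: project the query into the k-dimensional embedding space (U.T @ theta_q).
    projected = [sum(U[i][j] * theta_q[i] for i in range(n_terms)) for j in range(k)]
    # Step 2: map back to the vocabulary (U @ projected).
    return [sum(U[i][j] * projected[j] for j in range(k)) for i in range(n_terms)]

# Toy vocabulary of three terms with k = 2; terms 0 and 1 share an embedding.
U = [
    [1.0, 0.0],
    [1.0, 0.0],
    [0.0, 1.0],
]
theta_q = [1.0, 0.0, 0.0]  # the query contains only term 0
expanded = expand_query(U, theta_q)
```

In this toy case term 1 receives the same weight as the query term because the two have identical embeddings, while the orthogonal term 2 receives none.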
13 / 22
Query Expansion with Word Embeddings
$U_{\text{global}}$: embedding trained with $p(w \mid C)$
$U_{\text{local}}$: embedding trained with $p(w \mid R)$
14 / 22
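Getting $p(w \mid R)$
$$p(d) = \frac{\exp(-\mathrm{KL}(\theta_q, \theta_d))}{\sum_{d'} \exp(-\mathrm{KL}(\theta_q, \theta_{d'}))}$$
$$\tilde{p}(w \mid R) = \sum_{d} p(w \mid \theta_d)\, p(d)$$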
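The two formulas above amount to a softmax over negative KL scores followed by a mixture of document language models. A minimal sketch, with hypothetical function names and made-up toy inputs:

```python
import math

def document_posterior(kl_scores):
    """p(d) proportional to exp(-KL(theta_q, theta_d)), normalized over documents."""
    # Subtracting the minimum score keeps exp() well behaved and cancels in the ratio.
    m = min(kl_scores.values())
    weights = {d: math.exp(-(s - m)) for d, s in kl_scores.items()}
    z = sum(weights.values())
    return {d: w / z for d, w in weights.items()}

def relevance_model(doc_models, p_d):
    """Estimate p(w|R) as the sum over d of p(w|theta_d) * p(d)."""
    p_w = {}
    for d, theta_d in doc_models.items():
        for w, p in theta_d.items():
            p_w[w] = p_w.get(w, 0.0) + p * p_d[d]
    return p_w

# Toy inputs: d1 matches the query model exactly, d2 is farther away.
kl_scores = {"d1": 0.0, "d2": math.log(3.0)}
doc_models = {"d1": {"tax": 1.0}, "d2": {"dog": 1.0}}
p_d = document_posterior(kl_scores)
p_w_given_r = relevance_model(doc_models, p_d)
```

With these inputs the closer document gets three times the weight of the other, so its terms dominate the estimate.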
15 / 22
Experiments
16 / 22
Data
         docs         words        queries
trec12   469,949      438,338      150
robust   528,155      665,128      250
web      50,220,423   90,411,624   200
giga     9,875,524    2,645,367    -
wiki     3,225,743    4,726,862    -
17 / 22
Embeddings
• global
  • public embeddings (GloVe, word2vec)
  • word2vec on target corpus
• local: word2vec with documents sampled by p(d)
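The sampling step for the local setting could look roughly like the sketch below. It covers only drawing documents in proportion to p(d); training word2vec on the sample is left out. The function name, the sample size, and the choice to sample with replacement are assumptions, not details given in the slides.

```python
import random

def sample_documents(p_d, n, seed=0):
    """Draw n document ids with replacement, with probability proportional to p(d)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    doc_ids = list(p_d)
    weights = [p_d[d] for d in doc_ids]
    return rng.choices(doc_ids, weights=weights, k=n)

# Toy posterior over three documents (made-up values).
p_d = {"d1": 0.75, "d2": 0.25, "d3": 0.0}
sample = sample_documents(p_d, n=1000)
```

Documents with higher p(d) appear more often in the sample, so their terms carry more weight when an embedding is trained on it.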
18 / 22
• ten-fold cross-validation
• metric: NDCG@10
19 / 22
Results
                  global                                         local
                  wiki+giga                    gnews   target    target  giga    wiki
         QL       50     100    200    300     300     400       400     400     400
trec12   0.514    0.518  0.518  0.530  0.531   0.530   0.545     0.535   0.563*  0.523
robust   0.467    0.470  0.463  0.469  0.468   0.472   0.465     0.475   0.517*  0.476
web      0.216    0.227  0.229  0.230  0.232   0.218   0.216     0.234   0.236   0.258*
20 / 22
[Figure: two panels, global and local, showing top words by p̃(w|R) (blue: query; red: top words by p(w|R)).]
21 / 22
Summary
• local embedding provides a stronger representation than global embedding
• potential impact for other topic-specific natural language processing tasks
• future work
  • effectiveness improvements
  • efficiency improvements
22 / 22