Query Expansion with Locally-Trained Word Embeddings (Neu-IR 2016)


Transcript of Query Expansion with Locally-Trained Word Embeddings (Neu-IR 2016)

Query Expansion with Locally-Trained Word Embeddings

Fernando Diaz, Bhaskar Mitra, Nick Craswell

Microsoft

July 21, 2016

1 / 22

word embedding: discriminatively trained vector representation

2 / 22

$$
L = \sum_{t=1}^{T} \underbrace{\omega_{x_t}}_{\text{term weight}}
\Bigg[ \sum_{y \in V_t^{c}} \underbrace{\log \sigma\big(\phi(x_t) \cdot \phi(y)\big)}_{\text{observed context}}
\;+\; \sum_{y \in V_t^{n}} \underbrace{\log \sigma\big(-\phi(x_t) \cdot \phi(y)\big)}_{\text{negative context}} \Bigg]
$$
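A minimal NumPy sketch of this weighted skip-gram negative-sampling objective for a single target term; `phi`, `context`, `negatives`, and `weight` are hypothetical names standing in for φ, V_t^c, V_t^n, and ω_{x_t} (an illustration of the objective, not the authors' training code):

```python
import numpy as np

def weighted_sgns_term_loss(phi, term, context, negatives, weight):
    """Contribution of one target term x_t to the weighted SGNS objective.

    phi       : dict mapping term -> embedding vector (np.ndarray)
    term      : the target term x_t
    context   : observed context terms V_t^c
    negatives : sampled negative terms V_t^n
    weight    : term weight omega_{x_t}
    """
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
    observed = sum(np.log(sigmoid(phi[term] @ phi[y])) for y in context)
    negative = sum(np.log(sigmoid(-phi[term] @ phi[y])) for y in negatives)
    return weight * (observed + negative)
```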

3 / 22

$\omega_{x_t}$ needs to reflect the importance of the term at evaluation time.

4 / 22

$$
\sum_{t=1}^{T} \omega_{x_t = w} \;\propto\; p(w \mid C)
$$
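A toy illustration of what this equation says: with standard (unweighted) training, every corpus token contributes weight 1, so a term's total weight is just its corpus frequency, i.e. proportional to p(w|C). The corpus and names below are made up for illustration:

```python
from collections import Counter

# Toy corpus: each token implicitly gets weight 1 during training.
corpus_tokens = "the cat sat on the mat near the door".split()

total_weight = Counter(corpus_tokens)  # sum over t of omega_{x_t = w}
p_w_C = {w: n / len(corpus_tokens) for w, n in total_weight.items()}
# total_weight[w] == len(corpus_tokens) * p_w_C[w], so each term's total
# training weight is proportional to p(w|C).
```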

5 / 22

what terms are important at query time?

6 / 22

p(w|R): probability of the term in the relevant documents.

7 / 22

how different is p(w|R) from p(w|C)?

8 / 22

$$
\mathrm{KL}(R, C)_w = p(w \mid R) \log \frac{p(w \mid R)}{p(w \mid C)}
$$
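A small sketch of this per-term contribution to the divergence, assuming `p_R` and `p_C` are dictionaries holding p(w|R) and p(w|C) (hypothetical names; terms with zero relevance probability contribute nothing):

```python
import math

def kl_contributions(p_R, p_C):
    """Per-term contribution to KL(R, C): p(w|R) * log(p(w|R) / p(w|C))."""
    return {w: p * math.log(p / p_C[w]) for w, p in p_R.items() if p > 0}

# Ranking terms by their contribution surfaces the terms whose relevance
# probability most exceeds their corpus probability:
# ranked = sorted(kl_contributions(p_R, p_C).items(), key=lambda kv: -kv[1])
```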

9 / 22

[Figure: per-term contribution to KL(R, C) (y-axis: KL, 0.00–0.35) plotted against term rank (x-axis: rank)]

10 / 22

how much better can we do if we train with $\sum_{t=1}^{T} \omega_{x_t = w} \propto p(w \mid R)$?

11 / 22

Language Model Scoring

$$
\mathrm{score}(d, q) = \mathrm{KL}(\theta_q, \theta_d)
$$

θ_q: maximum likelihood query language model
θ_d: document language model
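A minimal sketch of KL-based scoring, assuming θ_q and θ_d are represented as dictionaries of term probabilities; real systems smooth the document model (e.g. Dirichlet smoothing), which is elided here with a small epsilon:

```python
import math

def kl_score(theta_q, theta_d, eps=1e-12):
    """score(d, q) = KL(theta_q, theta_d); lower means a better match."""
    return sum(p * math.log(p / max(theta_d.get(w, 0.0), eps))
               for w, p in theta_q.items() if p > 0)

# To rank documents, sort by ascending divergence:
# ranking = sorted(doc_ids, key=lambda d: kl_score(theta_q, doc_models[d]))
```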

12 / 22

Query Expansion with Word Embeddings

$$
\tilde{\theta}_q = U U^{\mathsf{T}} \theta_q
$$

U: |V| × k term embedding matrix
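A sketch of this projection in NumPy. The slide only shows the product U U^T θ_q; the clipping and renormalization below are an added assumption so that the expanded query model remains a probability distribution:

```python
import numpy as np

def expand_query(theta_q, U):
    """Expand a query language model with a term embedding matrix.

    theta_q : length-|V| vector of query term probabilities
    U       : |V| x k term embedding matrix
    """
    expanded = U @ (U.T @ theta_q)        # U U^T theta_q
    expanded = np.maximum(expanded, 0.0)  # assumption: clip negative weights
    return expanded / expanded.sum()      # assumption: renormalize to a distribution
```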

13 / 22

Query Expansion with Word Embeddings

U_global: embedding trained with p(w|C)
U_local: embedding trained with p(w|R)

14 / 22

Getting p(w|R)

$$
p(d) = \frac{\exp\big(-\mathrm{KL}(\theta_q, \theta_d)\big)}{\sum_{d'} \exp\big(-\mathrm{KL}(\theta_q, \theta_{d'})\big)}
$$

$$
\tilde{p}(w \mid R) = \sum_{d} p(w \mid \theta_d)\, p(d)
$$
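A sketch of these two steps (restating the hypothetical `kl_score` helper from the earlier sketch): document weights are a softmax over negative divergences, and the relevance model averages the document language models under those weights. In practice only a top-ranked subset of documents would be used, which is an assumption not stated on this slide:

```python
import math

def kl_score(theta_q, theta_d, eps=1e-12):
    return sum(p * math.log(p / max(theta_d.get(w, 0.0), eps))
               for w, p in theta_q.items() if p > 0)

def relevance_model(theta_q, doc_models):
    """Estimate p~(w|R): p(d) is a softmax over -KL, then average the doc models."""
    weights = [math.exp(-kl_score(theta_q, theta_d)) for theta_d in doc_models]
    z = sum(weights)
    p_d = [w / z for w in weights]                 # p(d)
    p_w_R = {}
    for theta_d, pd in zip(doc_models, p_d):
        for w, p in theta_d.items():
            p_w_R[w] = p_w_R.get(w, 0.0) + p * pd  # sum_d p(w|theta_d) p(d)
    return p_w_R
```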

15 / 22


Experiments

16 / 22

Data

         docs        words       queries
trec12   469,949     438,338     150
robust   528,155     665,128     250
web      50,220,423  90,411,624  200
giga     9,875,524   2,645,367   -
wiki     3,225,743   4,726,862   -

17 / 22

Embeddings

• global
  • public embeddings (GloVe, word2vec)
  • word2vec on target corpus

• local: word2vec with documents sampled by p(d) (see the sketch below)
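A sketch of training such a local embedding with gensim (parameter names per gensim 4.x); the sample size, negative-sampling count, and other hyperparameters here are illustrative assumptions, not the settings used in the talk:

```python
import random
from gensim.models import Word2Vec

def train_local_embedding(docs, p_d, n_samples=1000, dim=400, seed=0):
    """Train a query-specific word2vec model on documents sampled by p(d).

    docs : tokenized documents (list of list of str)
    p_d  : sampling distribution over docs, e.g. the softmax of -KL scores
    """
    rng = random.Random(seed)
    sampled = rng.choices(docs, weights=p_d, k=n_samples)  # sample with replacement
    return Word2Vec(sentences=sampled, vector_size=dim, sg=1,
                    negative=10, min_count=1, workers=4)
```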

18 / 22

• ten-fold cross-validation
• metric: NDCG@10 (see the sketch below)
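A reference sketch of NDCG@10, assuming graded relevance judgments; this uses the common 2^rel − 1 gain with a log2 discount, which may differ from the exact evaluation tool used in the talk:

```python
import math

def ndcg_at_10(ranked_grades, all_grades):
    """NDCG@10 for one query.

    ranked_grades : relevance grades of retrieved documents, in rank order
    all_grades    : relevance grades of all judged documents (for the ideal ranking)
    """
    def dcg(grades):
        return sum((2 ** g - 1) / math.log2(i + 2)
                   for i, g in enumerate(grades[:10]))
    ideal = dcg(sorted(all_grades, reverse=True))
    return dcg(ranked_grades) / ideal if ideal > 0 else 0.0
```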

19 / 22

Results

NDCG@10 (rows: collections; columns: embedding training corpus and dimensionality)

          QL     |              global                             |          local
                 | wiki+giga                    gnews    target    | target   giga     wiki
                 | 50     100    200    300    300      400        | 400      400      400
trec12    0.514  | 0.518  0.518  0.530  0.531  0.530    0.545      | 0.535    0.563*   0.523
robust    0.467  | 0.470  0.463  0.469  0.468  0.472    0.465      | 0.475    0.517*   0.476
web       0.216  | 0.227  0.229  0.230  0.232  0.218    0.216      | 0.234    0.236    0.258*

20 / 22

[Figure: global vs. local embeddings, showing the top words by p̃(w|R) (blue: query; red: top words by p(w|R))]

21 / 22

Summary

• local embedding provides a stronger representation than global embedding

• potential impact for other topic-specific natural language processing tasks

• future work
  • effectiveness improvements
  • efficiency improvements

22 / 22