Hybrid-ε-greedy for Mobile Context-Aware Recommender System
Djallel Bouneffouf, Amel Bouzeghoub & Alda Lopes Gançarski
Institut Télécom, Télécom SudParis, France
OUTLINE
1. Introduction
2. State of the art
3. Proposition
4. Experimental evaluation
5. Conclusion
1. Introduction
Nomalys (www.nomalys.com)
• Software vendor
• Access to and navigation of corporate data
MOBILE INFORMATION SYSTEMS
Context
Context-based Recommender System
• To reduce search and navigation time
• To assist users in finding information
PROBLEMS IN CONTEXT-BASED RECOMMENDER SYSTEMS
Inputs:
• User
• Item inventory: articles, web pages, documents, …
• Context: location, time, …
Contextual recommender system algorithm:
• Selects item(s) to show
• Gets feedback (click, time spent, …)
• Refines the models
• Repeats (a large number of times), optimizing the metrics of interest (total number of clicks, total reward, …)
Problems:
• How to recommend information to users, taking into account their surrounding environment (location, time, nearby people)?
• How to follow the evolution of the user's interests?
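The select / feedback / refine / repeat cycle listed on this slide can be sketched as a small loop. This is only an illustrative skeleton: the function name `recommend_loop` and its two callbacks `select` and `get_feedback` are hypothetical, and the "model" here is just a pair of counters per item.

```python
def recommend_loop(select, get_feedback, rounds):
    """Run the select -> feedback -> refine cycle for a fixed number of rounds.

    select(shown, clicks) returns the item to display (hypothetical policy).
    get_feedback(item) returns a reward in [0, 1], e.g. 1.0 for a click.
    """
    shown = {}    # how many times each item was displayed
    clicks = {}   # accumulated reward per item
    total = 0.0   # metric of interest: total reward
    for _ in range(rounds):
        item = select(shown, clicks)                   # select an item to show
        reward = get_feedback(item)                    # get feedback
        shown[item] = shown.get(item, 0) + 1           # refine the model
        clicks[item] = clicks.get(item, 0.0) + reward
        total += reward
    return total, shown, clicks


# Toy usage: always show the same item, which is always clicked.
total, shown, clicks = recommend_loop(
    select=lambda shown, clicks: "doc_a",
    get_feedback=lambda item: 1.0,
    rounds=5,
)
```

Any selection policy that reads the two counters can be plugged in as `select` without changing the loop.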
2. State of the art
USER OR EXPERT SPECIFICATION
Constraints:
• Laborious
• Not a dynamic system
• Not a personalized system
Advantage:
• Context management
References: [Panayiotou, 2006], [Bila, 2008], [Bellotti, 2008], [Dobson, 2005], [Lakshmish, 2009], [Alexandre de Spindler, 2006], [Mieczysław, 2009], [Wei, 2010], [Lihong, 2010]
CONTENT-BASED AND COLLABORATIVE FILTERING
[Figure labels: dataset; situations (Meeting, Home, Drive, Office); social group; action; reward]
Constraints:
• Cold start problem
• Slow training
Advantages:
• Context management
• Automatic process
References: [Panayiotou, 2006], [Bila, 2008], [Bellotti, 2008], [Dobson, 2005], [Lakshmish, 2009], [Alexandre de Spindler, 2006], [Mieczysław, 2009], [Wei, 2010], [Lihong, 2010]
MACHINE LEARNING: REINFORCEMENT LEARNING
- The greedy strategy performs only exploitation;
- The ε-greedy strategy adds some random actions (exploration).
Advantages:
• Solves the cold start problem
• Follows the evolution of user interest
Constraints:
• No context management
• Slow training
References: [Panayiotou, 2006], [Bellotti, 2008], [Bila, 2008], [Dobson, 2005], [Lakshmish, 2009], [Alexandre de Spindler, 2006], [Mieczysław, 2009], [Wei, 2010], [Lihong, 2010]
[Figure: exploration vs. exploitation example; reported means: 0.48 and 0.79]
Documents: D1 D2 D3 D4 D5 D6 D8 D9 D10
Displays: 12 12 12 12
Number of clicks: 7 5 2 1
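As a worked example of how display and click counts turn into a CTR estimate, the four (displays, clicks) pairs from the table above can be divided out as follows. Which documents the four values belong to is not recoverable from the slide, so the list positions below are placeholders rather than document identifiers.

```python
# Display and click counts as given on the slide; positions are placeholders
# because the mapping to specific documents was lost.
displays = [12, 12, 12, 12]
clicks = [7, 5, 2, 1]

# Empirical click-through rate: clicks divided by displays.
ctr = [c / n for c, n in zip(clicks, displays)]

# A purely greedy (exploitation-only) choice picks the highest estimate.
best = max(range(len(ctr)), key=ctr.__getitem__)

print([round(v, 3) for v in ctr])  # [0.583, 0.417, 0.167, 0.083]
print(best)                        # 0
```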
STATE OF THE ART
Learning-profile approaches: user or expert specification; content-based and collaborative filtering; reinforcement learning.

Reference                      CM  SC  CB  AP  EV  CS
[Panayiotou, 2006]             +   +   -   -   -   -
[Bila, 2008]                   +   +   -   -   -   -
[Bellotti, 2008]               +   +   +   +   -   -
[Dobson, 2005]                 +   +   +   +   -   -
[Lakshmish, 2009]              +   -   -   +   -   -
[Alexandre de Spindler, 2006]  +   -   -   +   -   -
[Mieczysław, 2009]             -   -   -   +   +   +
[Wei, 2010]                    -   -   -   +   +   +
[Lihong, 2010]                 -   -   -   +   +   +

CM = context management; SC = semantic context representation; CB = content-based; AP = automatic process; EV = follows the evolution of user interest; CS = solves the cold start.
3. Proposition
MULTI-ARMED BANDITS (MAB)
A (basic) MAB problem has:
• A set D of possibilities (documents)
• An expected reward CTR(d) ∈ [0,1] for each d ∈ D
• In each round: the algorithm picks a document d ∈ D based on past history
• Reward: an independent sample in [0,1] with expectation CTR(d)
• Classical setting that models the exploration/exploitation trade-off
Documents: D1 D2 D3 D4 D5 D6 D8 D9 D10
CTR: 0.6 0.4 0.3 0.5
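The reward model above ("an independent sample in [0,1] with expectation CTR(d)") can be illustrated with click/no-click rewards, which are the simplest special case. The sketch below uses the four CTR values from the slide; the arm names `doc_a` to `doc_d` are placeholders, since the slide does not say which documents the values belong to.

```python
import random

# CTR values from the slide; arm names are placeholders (mapping to D1..D10 lost).
CTR = {"doc_a": 0.6, "doc_b": 0.4, "doc_c": 0.3, "doc_d": 0.5}


def pull(doc, rng):
    """Draw one click (1) or no-click (0) with expectation CTR[doc]."""
    return 1 if rng.random() < CTR[doc] else 0


def estimate_ctr(doc, pulls, rng):
    """Sample-average estimate of CTR(doc) from repeated displays."""
    return sum(pull(doc, rng) for _ in range(pulls)) / pulls


rng = random.Random(0)  # fixed seed so the run is reproducible
estimates = {doc: estimate_ctr(doc, 10_000, rng) for doc in CTR}
```

With enough displays the sample averages approach the true values. A bandit algorithm has to trade those displays off against showing the document that currently looks best.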
CONTEXTUAL BANDITS
X is a set of situations,
D is a set of arms,
CTR: X × D → [0,1] gives the expected rewards.
• In each round:
  • A situation x ∈ X occurs
  • The algorithm picks an arm d ∈ D
  • Reward: an independent sample in [0,1] with expectation CTR(x, d)
Situations: Meeting, Home, Drive, Office
x1:
Documents: D1 D2 D3 D4 D5 D6 D8 D9 D10
CTR: 0.2 0.4 0.3 0.4
x2:
Documents: D1 D2 D3 D4 D5 D6 D8 D9 D10
CTR: 0.6 0.4 0.3 0.5
x3:
Documents: D1 D2 D3 D4 D5 D6 D8 D9 D10
CTR: 0.2 0.1 0.3 0.7
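A short sketch of why the situation matters, using the three CTR rows shown for x1, x2 and x3. It assumes the rows are listed in that order and that the three situations are equally likely; arms are identified by position only, since the mapping to document names is not recoverable.

```python
# Per-situation CTR rows from the slide (arm = position in the row).
CTR = {
    "x1": [0.2, 0.4, 0.3, 0.4],
    "x2": [0.6, 0.4, 0.3, 0.5],
    "x3": [0.2, 0.1, 0.3, 0.7],
}


def best_arm(situation):
    """Arm with the highest expected reward in the given situation."""
    row = CTR[situation]
    return max(range(len(row)), key=row.__getitem__)


# A context-free policy can only look at the average over situations
# (assumed equally likely here).
n_arms = len(CTR["x1"])
average = [sum(CTR[x][a] for x in CTR) / len(CTR) for a in range(n_arms)]
context_free_arm = max(range(n_arms), key=average.__getitem__)

# In x2 the context-free choice earns 0.5 in expectation,
# while the contextual choice earns 0.6.
gap_in_x2 = CTR["x2"][best_arm("x2")] - CTR["x2"][context_free_arm]
```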
GET SITUATION FROM CONTEXT: SENSING
Time: Mon Oct 3 12:10:00 2011
GPS: "38.868143, 2.3484122"
Client: NATIXIS
GET SITUATION FROM CONTEXT: THINKING ABSTRACTION
Time: Mon Oct 3 12:10:00 2011 → Time ontology
GPS: "38.868143, 2.3484122" → Location ontology
Client: NATIXIS → Social ontology

situation  time     location  social
x1         workday  Paris     Bank
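The abstraction step can be sketched as three lookups, one per context dimension. Everything below is a stand-in: the dictionaries `PLACE_OF_GPS` and `CONCEPT_OF_CLIENT` are hypothetical replacements for the location and social ontologies, and the rule that weekends count as "holiday" is an assumption, not something stated on the slide.

```python
from datetime import datetime

# Hypothetical lookup tables standing in for the location and social ontologies.
PLACE_OF_GPS = {"38.868143, 2.3484122": "Paris"}
CONCEPT_OF_CLIENT = {"NATIXIS": "Bank"}


def abstract_time(ts, holidays=frozenset()):
    """Map a timestamp to a coarse time concept (assumed rule: weekends are holidays)."""
    if ts.weekday() >= 5 or ts.date() in holidays:
        return "holiday"
    return "workday"


def abstract_context(ts, gps, client):
    """Turn raw sensed values into a (time, location, social) situation."""
    return (
        abstract_time(ts),
        PLACE_OF_GPS.get(gps, "unknown"),
        CONCEPT_OF_CLIENT.get(client, "unknown"),
    )


# Sensed values from the slide: Mon Oct 3 12:10:00 2011, the GPS string, NATIXIS.
x1 = abstract_context(datetime(2011, 10, 3, 12, 10, 0), "38.868143, 2.3484122", "NATIXIS")
```

For the sensed values on the slide this yields the row shown for x1: workday, Paris, Bank.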
GET SITUATION FROM CONTEXT: RETRIEVING THE RELEVANT SITUATION
Past situations:
IDS  Users    Time     Place     Client
1    Paul     Workday  Paris     BNP
2    Fabrice  Holiday  Evry      MGET
3    Paul     Workday  Gentilly  AMUNDI

Current situation:
IDS  Users    Time     Place     Client
1    Paul     Workday  Paris     NATIXIS

RetrieveSituation (using the location, time and social ontologies) returns:
IDS  Users    Time     Place     Client
1    Paul     Workday  Paris     BNP
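One way to picture the retrieval step is a nearest-case lookup over the stored situations. The scoring below is an assumption made for illustration only (1 for an exact match on a dimension, 0.5 when two values share a concept, 0 otherwise); the slides do not give the actual similarity measure, and `CONCEPT_OF` is a hypothetical stand-in for the ontologies.

```python
# Hypothetical concept map standing in for the social ontology.
CONCEPT_OF = {"NATIXIS": "Bank", "BNP": "Bank"}

PAST = [
    {"ids": 1, "user": "Paul", "time": "Workday", "place": "Paris", "client": "BNP"},
    {"ids": 2, "user": "Fabrice", "time": "Holiday", "place": "Evry", "client": "MGET"},
    {"ids": 3, "user": "Paul", "time": "Workday", "place": "Gentilly", "client": "AMUNDI"},
]


def similarity(a, b):
    """Assumed score: 1 per exact match, 0.5 per shared concept, averaged over dimensions."""
    dims = ("time", "place", "client")
    score = 0.0
    for key in dims:
        if a[key] == b[key]:
            score += 1.0
        elif CONCEPT_OF.get(a[key]) is not None and CONCEPT_OF.get(a[key]) == CONCEPT_OF.get(b[key]):
            score += 0.5
    return score / len(dims)


def retrieve_situation(current, past):
    """Return the stored situation most similar to the current one."""
    return max(past, key=lambda case: similarity(current, case))


current = {"user": "Paul", "time": "Workday", "place": "Paris", "client": "NATIXIS"}
best = retrieve_situation(current, PAST)
```

Under these assumptions the current situation (Paul, Workday, Paris, NATIXIS) retrieves stored situation 1 (client BNP), which matches the example on the slide.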
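SELECT DOCUMENTS
Documents: d1 d2 d3 d4 d5 d6 d8 d9 d10
CTR: 0.6 0.2 0.4
Hybrid-ε-greedy
\[
d_t =
\begin{cases}
\arg\max_{d \in D} CTR(d) & \text{with probability } 1 - \varepsilon \\
\mathrm{Random}(D) & \text{with probability } \varepsilon
\end{cases}
\]
CBR-ε-greedy
\[
d_t =
\begin{cases}
\arg\max_{d \in D} CTR(d) & \text{with probability } 1 - \varepsilon \\
CBF(d) & \text{with probability } z \\
\mathrm{Random}(D) & \text{with probability } k
\end{cases}
\]
• ε is the probability of exploration, with ε = z + k
• CBF(d) gives documents similar to document d (CBF: content-based filtering)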
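The three-way rule on this slide (exploit with probability 1 − ε, use CBF with probability z, pick at random with probability k, where ε = z + k) can be sketched as below. This is a minimal reading of the formula, not the authors' implementation: `similar_to` is a hypothetical stand-in for CBF, it is applied to the current best document as an assumption, and the document names are placeholders.

```python
import random


def select_document(ctr, similar_to, epsilon, z, rng=random):
    """Pick a document following the three-way rule, with k = epsilon - z.

    ctr: dict mapping document -> estimated CTR.
    similar_to: function document -> list of similar documents (hypothetical CBF).
    """
    if not 0.0 <= z <= epsilon <= 1.0:
        raise ValueError("need 0 <= z <= epsilon <= 1")
    best = max(ctr, key=ctr.get)
    q = rng.random()
    if q < 1.0 - epsilon:          # exploitation, probability 1 - epsilon
        return best
    if q < 1.0 - epsilon + z:      # content-based exploration, probability z
        candidates = similar_to(best)
        if candidates:
            return rng.choice(candidates)
        # no similar document known: fall through to random exploration
    return rng.choice(list(ctr))   # random exploration, probability k


# Placeholder documents with the three CTR values shown on the slide.
ctr = {"doc_a": 0.6, "doc_b": 0.2, "doc_c": 0.4}
neighbours = {"doc_a": ["doc_c"]}
rng = random.Random(1)
picks = [select_document(ctr, lambda d: neighbours.get(d, []), 0.3, 0.2, rng)
         for _ in range(1000)]
```

Setting z = 0 reduces the rule to the plain two-case ε-greedy formula, and setting ε = 0 makes it purely greedy.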
4. Experimental evaluation
Diary situation entries
IDS  Users    Time        Place     Client
1    Paul     11/05/2011  Paris     AFNOR
2    Fabrice  15/05/2011  Evry      MGET
3    Paul     19/05/2011  Gentilly  AMUNDI

Diary navigation entries
IdDoc  IDS  Click  Time    Interest  Documents
1      1    2      2 min   3/5       Demand
2      1    3      3 min   1/5       Contact
3      2    1      50 sec  null      Person
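To make the link between the two diary tables concrete, the sketch below loads the three navigation rows and sums clicks per situation through the shared IDS key. The class and field names are hypothetical, the Click column is assumed to be a click count, and viewing times are converted to seconds for convenience.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class NavigationEntry:
    id_doc: int
    ids: int                 # key of the diary situation entry
    clicks: int              # assumed to be a click count
    seconds: int             # viewing time converted to seconds
    interest: Optional[str]  # kept as written on the slide, e.g. "3/5"
    document: str


entries = [
    NavigationEntry(1, 1, 2, 120, "3/5", "Demand"),
    NavigationEntry(2, 1, 3, 180, "1/5", "Contact"),
    NavigationEntry(3, 2, 1, 50, None, "Person"),
]

# Total clicks observed in each situation.
clicks_per_situation = defaultdict(int)
for entry in entries:
    clicks_per_situation[entry.ids] += entry.clicks
```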
EXPERIMENTAL DATASETS
• Data from Nomalys
• 16 286 diary situations
• 342 725 navigation entries
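RECOMMEND DOCUMENTS: ε VARIATION
[Figure: two plots of CTR as a function of ε, "ε variation on learning" and "ε variation on deployment"]
ε is the probability of exploration:
\[
d_t =
\begin{cases}
\arg\max_{d \in D} CTR(d) & \text{with probability } 1 - \varepsilon \\
\mathrm{Random}(D) & \text{with probability } \varepsilon
\end{cases}
\]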
RECOMMEND DOCUMENTS: DATA SIZE VARIATION
[Figure: two plots of CTR as a function of data size, "Data size on learning" and "Data size on deployment"]
CONCLUSION
Our experiments lead to the conclusion that:
• Considering the user's context for the exploration/exploitation strategy significantly increases the performance of the recommender system.
In the future:
• We plan to investigate methods that automatically learn the optimal exploration/exploitation trade-off.