Anomalous Association Rules
Máster Oficial en Soft Computing y Sistemas Inteligentes
Universidad de Granada
Introduction
Related Work
Our Definition
Algorithm
Experimentation
Conclusions
Introduction
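Association rule: $X \Rightarrow Y$
Frequent: $\mathrm{Supp}(X \Rightarrow Y) \equiv \mathrm{Supp}(X \cup Y) \geq \varepsilon$ (5%)
Confident: $\mathrm{Conf}(X \Rightarrow Y) = \frac{\mathrm{Supp}(X \cup Y)}{\mathrm{Supp}(X)} \geq \theta$ (80%)
Applications: market basket, CRM, etc.
Goal: find all the frequent and confident associations.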
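The support and confidence definitions above can be illustrated with a minimal Python sketch. The function names, the toy transactions, and the default thresholds (5% and 80%, taken from the slide) are assumptions made for illustration; this is not the authors' implementation.

```python
from typing import Iterable, List, Set


def support(itemset: Iterable[str], transactions: List[Set[str]]) -> float:
    """Fraction of transactions that contain every item of `itemset`."""
    items = set(itemset)
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if items <= t)
    return hits / len(transactions)


def confidence(x: Iterable[str], y: Iterable[str],
               transactions: List[Set[str]]) -> float:
    """Conf(X => Y) = Supp(X u Y) / Supp(X); 0.0 when X never occurs."""
    supp_x = support(x, transactions)
    if supp_x == 0:
        return 0.0
    return support(set(x) | set(y), transactions) / supp_x


def is_frequent_and_confident(x, y, transactions, eps=0.05, theta=0.80) -> bool:
    """True when X => Y meets both the support and the confidence threshold."""
    return (support(set(x) | set(y), transactions) >= eps
            and confidence(x, y, transactions) >= theta)


# Hypothetical toy data: 10 market baskets.
transactions = (
    [{"bread", "milk"}] * 4
    + [{"bread", "milk", "eggs"}, {"bread"}, {"eggs"}, {"eggs"}, {"milk"}, {"tea"}]
)

print(support({"bread", "milk"}, transactions))
print(confidence({"bread"}, {"milk"}, transactions))
print(is_frequent_and_confident({"bread"}, {"milk"}, transactions))
```

In this toy data, bread and milk appear together in 5 of 10 baskets and bread appears in 6, so the rule bread => milk passes both thresholds.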
Problem: Thousands of rules are found.
Unmanageable for any user!
There are too many spurious associations.
Possible solutions:
- Subjective measures
- Objective measures
The main problem is the type of knowledge an association rule represents.
The crucial problem is to determine which kind of events we are interested in, so that we can appropriately characterize them.
It is often more interesting to find surprising non-frequent events than frequent ones. The type of interesting event is application-dependent.
Infrequent itemsets in intrusion detection systems
Exceptions to associations for the detection of conflicting medicine therapies
Unusual short sequences of nucleotides in genome sequencing
Etc.
Our Objective
To introduce the concept of anomalous association rule as a confident rule representing homogeneous deviations from common behavior.
Related Work
Suzuki, Hussain & Suzuki: “Exception Rules”
$X \Rightarrow Y$ is an association rule
$X \wedge I \Rightarrow \neg Y$ is the exception rule
X I is the reference rule
I is the “interacting” itemset
Too many exceptions
Our Definition
Anomalous association rule:
$X \Rightarrow Y$ is frequent and confident
$X \neg Y \Rightarrow A$ is confident
$X Y \Rightarrow \neg A$ is confident
X usually implies Y (dominant rule).
When X does not imply Y, then it usually implies A (the anomaly).
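The three conditions of the definition can be checked directly on a transaction list. The sketch below is only an illustration: the function name, the thresholds, the toy data, and the reading of "not Y" as "Y is not fully contained in the transaction" are all assumptions, not something the slides specify.

```python
from typing import Iterable, List, Set


def is_anomalous_rule(x: Iterable[str], y: Iterable[str], a: Iterable[str],
                      transactions: List[Set[str]],
                      min_supp: float = 0.05, min_conf: float = 0.75) -> bool:
    """Check the three conditions: X => Y frequent and confident,
    X (not Y) => A confident, and X Y => (not A) confident."""
    x, y, a = set(x), set(y), set(a)
    n = len(transactions)
    with_x = [t for t in transactions if x <= t]
    x_and_y = [t for t in with_x if y <= t]
    # Assumption: "not Y" means the itemset Y is not fully present.
    x_not_y = [t for t in with_x if not (y <= t)]
    if n == 0 or not x_and_y or not x_not_y:
        return False

    # 1. Dominant rule X => Y is frequent and confident.
    dominant = (len(x_and_y) / n >= min_supp
                and len(x_and_y) / len(with_x) >= min_conf)
    # 2. When Y fails, A usually holds: X (not Y) => A is confident.
    anomaly = sum(1 for t in x_not_y if a <= t) / len(x_not_y) >= min_conf
    # 3. A is usually absent when Y holds: X Y => (not A) is confident.
    exclusive = sum(1 for t in x_and_y if not (a <= t)) / len(x_and_y) >= min_conf
    return dominant and anomaly and exclusive


# Hypothetical toy data: X usually comes with Y; the two exceptions come with A.
transactions = [{"x", "y"}] * 8 + [{"x", "a"}] * 2 + [{"z"}] * 2

print(is_anomalous_rule({"x"}, {"y"}, {"a"}, transactions))
print(is_anomalous_rule({"x"}, {"y"}, {"z"}, transactions))
```

With this data, X => Y holds in 8 of the 10 transactions containing X, and both exceptions contain A, so A qualifies as the anomaly while z does not.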
X Y A1 Z1…
X Y A1 Z2…
X Y A2 Z3…
X Y A2 Z1…
X Y A3 Z2…
X Y A3 Z3…
X Y A Z …
X Y3 A Z3
…
X Y3 A Z …
X Y4 A Z …
$X \Rightarrow Y$ is the dominant rule
$X \Rightarrow A$ when $\neg Y$ is the anomalous rule
Some overlapping cases may appear.
If symptoms-X then disease-Y
If symptoms-X then disease-A when not disease-Y
disease-A does not occur at the same time as symptoms-X and disease-Y
Algorithm
Based on TBAR “Tree based association rules” Data & Knowledge Engineering (2001)
Berzal, Cubero, Marín, Serrano
Algorithm (assoc. rules)
Possible items: A, B, C, D, E, F
[Figure: itemset tree; #n is the number of instances containing the itemset]
L1: A #7, B #9, C #7, D #8
L2: AB #6, AD #5, BC #6, BD #7, CD #5
L3: ABD #5, BCD #5
Callouts: 7 instances with A, 6 inst. with AB, 5 inst. with AD, 5 inst. with ABD, 6 inst. with BC
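The level-wise construction of L1, L2, L3 can be sketched as follows. This is a generic Apriori-style enumeration written for illustration, not TBAR itself (TBAR stores the itemsets in a tree); the function name, the toy transactions, and the minimum count are assumptions.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Iterable, List


def frequent_itemsets(transactions: Iterable[Iterable[str]],
                      min_count: int) -> Dict[FrozenSet[str], int]:
    """Return {itemset: count} for every itemset found in at least
    `min_count` transactions, built level by level (L1, L2, ...)."""
    data: List[FrozenSet[str]] = [frozenset(t) for t in transactions]

    # L1: count single items.
    counts: Dict[FrozenSet[str], int] = {}
    for t in data:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    level = {s: c for s, c in counts.items() if c >= min_count}
    result = dict(level)

    k = 2
    while level:
        prev = set(level)
        # Join step: merge frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in prev for s in combinations(c, k - 1))}
        # Count step: one pass over the data per level.
        counts = {c: sum(1 for t in data if c <= t) for c in candidates}
        level = {s: c for s, c in counts.items() if c >= min_count}
        result.update(level)
        k += 1
    return result


# Hypothetical toy data (not the dataset behind the slide's figure).
toy = [set("ABD"), set("ABD"), set("AB"), set("BCD"), set("BC"), set("CD")]
found = frequent_itemsets(toy, min_count=2)
for itemset, count in sorted(found.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print("".join(sorted(itemset)), "#%d" % count)
```

The prune step is what keeps the candidate set small: an itemset such as ABC is never counted because AC is not frequent.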
Algorithm (anomalous rules)
Possible items: A, B, C, D, E, F
[Figure: itemset tree extended for anomalous rules]
First scan: A #7, B #9, C #7, D #8
Second scan: A #7 with children B #6, D #5; extended node A* with AB #6, AC #4, AD #5, AE #3, AF #3 (includes non-frequent itemsets)
Candidate generation
[Figure: itemset tree after the second scan. First scan: A #7, B #9, C #7, D #8. Second scan: extended nodes A*, B*, C*, D*; level-2 nodes AB #6, AD #5, BC #6, BD #7, CD #5]
Rule generation:
Immediate from the frequent items
Experimentation
The “core” of $X \Rightarrow Y|A$ is $Y|A$
[Figure: number of cores per dataset. Axis label: number of elements, ticks from 0 to 1000. Each dataset appears twice, with the label suffixes _0.01_0.75 and _0.05_0.75: post-operative, disc_adult, disc_pima, disc_contraceptive, breast_cancer, disc_hepatitis, disc_thyroid, nursery, w_breast]
Rule format: if X then A when not Y
- $X \Rightarrow Y$: Y is the usual consequent
- $X \neg Y \Rightarrow A$: A is the “anomaly”
Nursery:
if NURSERY:very_crit and HEALTH:priority
then CLASS:priority (9 out of 9)
when not CLASS:spec_prior
Usual consequent: CLASS:spec_prior
“Anomaly”: CLASS:priority
Census:
if WORKCLASS: Local-gov
then CAPGAIN: [99999.0 , 99999.0] (7 out of 7)
when not CAPGAIN: [0.0 , 20051.0]
Usual consequent: CAPGAIN: [0.0 , 20051.0]
“Anomaly”: CAPGAIN: [99999.0 , 99999.0]
Conclusions
We have introduced an alternative type of interesting knowledge: anomalous association rules.
We have given an efficient algorithm to detect all the anomalies.
Future Work:
- To complete experimentation
- To filter the anomalies, eliminating redundant rules
- To introduce measures of interest for the anomalies, allowing their ordering