Proceedings of the 2006 IEEE International Conference on Information Acquisition, August 20-23, 2006, Weihai, Shandong, China

Improvement to the Criterion ω(x, y) in Positive Reduct of Decision Table

Gao Yan¹, Du Weifeng², Qin Keyun², Wang Fuli¹

¹College of Computer Science & Technology, Henan Polytechnic University, Jiaozuo, Henan Province, China
²School of Information Science & Technology, Southwest Jiaotong University, Chengdu, Sichuan Province, China
[email protected]

Abstract - The paper improves the condition ω(x, y) which the elements of the discernibility matrix introduced by Skowron satisfy. This brings two advantages: the verification of the condition is simpler, and fewer elements of the discernibility matrix satisfy it. The complexity of computing reducts with the discernibility function therefore decreases. The property is effective not only for manual operation but also for computer processing. The paper is organized as follows: firstly, the starting point of the idea is presented vividly, from two facets, in the positive region chart; secondly, we prove the improved condition; lastly, we illustrate the validity of the improved condition.

Index Terms - decision table, reduct, discernibility matrix, discernibility function

I. INTRODUCTION

Rough sets theory is a new mathematical tool for handling uncertain and incomplete information [1,2]. It was initially proposed by the Polish mathematician Pawlak [1] in 1982. After more than twenty years of research and development it has made great progress both in theory and in applications, and it attracted broad attention after its successful use in knowledge discovery. By now it has been applied in many domains, such as artificial intelligence, knowledge discovery in databases, pattern recognition, and failure detection.

Rough sets theory holds that knowledge is essentially a kind of capability of classification. Such capability exists not only in human beings but also in other species, and the capability of classification embodies the knowledge its owner possesses. A knowledge base is simply a relational system K = (U, ℛ), in which the equivalence relation family ℛ characterizes the classification of the universe U.

If P ⊆ ℛ and P ≠ ∅, then ∩P is also an equivalence relation, which is called the indiscernibility relation on P and is denoted ind(P).
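As a small illustration (our own sketch, not part of the original paper), ind(P) can be represented by the partition it induces: objects with equal value vectors over P fall into the same block. In Python (the names partition and table are ours):

    def partition(table, P):
        # blocks of ind(P): objects with equal value vectors over P share a block
        # table maps object -> {attribute: value}
        blocks = {}
        for x, row in table.items():
            blocks.setdefault(tuple(row[a] for a in P), set()).add(x)
        return list(blocks.values())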

As to two knowledge bases K = (U, ℛ) and K' = (U, ℛ'), when ind(ℛ) = ind(ℛ') we say that K and K' are equivalent, denoted K ≅ K'.

Knowledge reduct is one of the main contents of rough sets theory. As we all know, the attributes of a knowledge base (relational system) are not all equally important, and some attributes are redundant. Knowledge reduct discards irrelevant or unimportant knowledge while keeping the classification capability of the knowledge base [4]. In order to discuss reducts, we define as follows.

Let ℛ be a family of equivalence relations and R ∈ ℛ. If ind(ℛ) = ind(ℛ - {R}) holds, we say R is redundant in ℛ; otherwise we say R is necessary in ℛ. If each R ∈ ℛ is necessary, ℛ is independent; otherwise ℛ is dependent. Suppose S ⊆ ℛ; if S is independent and ind(S) = ind(ℛ), then S is a reduct of ℛ.

So in an information system, discarding some attributes will not influence its classification capability. We only need to keep the subsets which can constitute a reduct, and the new information system will have the same classification capability as the original one. A natural idea is to search all the subsets of ℛ to acquire all the reducts; but unfortunately, searching all the subsets of a set is an NP problem, so it is infeasible practically.

In 1992, Prof. Skowron of Warsaw University introduced the discernibility matrix and the discernibility function. He pointed out that all the conjuncts of the minimal disjunctive normal form of the discernibility function are exactly the reducts of the attribute set A. Though the algorithm of transforming the conjunctive normal form of the discernibility function into its disjunctive normal form is still exponentially complex, the method is simple, clear and easy to operate, and its computation can be reduced greatly using the absorption law of Boolean expressions. It is feasible when the scale of the problem is not too big. It is the best approach to get all the exact reducts, and it can serve as a criterion to test other heuristic algorithms; besides, the thought of problem solving that the method provides has important meaning.

II. DISCERNIBILITY FUNCTION OF DECISION TABLE

In rough sets theory, a decision table can be denoted by a 3-tuple S = (U, A, d), where U is the universe, |U| = n, A is the conditional attribute set, and d is the decision attribute. The discernibility matrix of the decision table is an n×n matrix, each element of which is

α*(x, y) = {a ∈ A | a(x) ≠ a(y) ∧ ω(x, y)}

In order to obtain the positive reduct (since only the positive reduct is discussed here, it is denoted simply as reduct), for all x, y ∈ U, ω(x, y) is [3]:

x ∈ pos_A(d) ∧ y ∉ pos_A(d)  (1)
or
x ∉ pos_A(d) ∧ y ∈ pos_A(d)  (2)
or
x, y ∈ pos_A(d) ∧ (x, y) ∉ ind(d)  (3)
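To make the definitions concrete, here is a minimal Python sketch of pos_A(d), of the condition ω(x, y), and of the matrix α*(x, y). It is our own illustration rather than the paper's code (the names U, A, ind, pos, omega and alpha are ours), and it uses the decision table of Section IV (Table I) as data:

    from itertools import combinations

    # Decision table of Table I (Section IV): object -> {attribute: value}
    U = {
        1: {'a': 2, 'b': 2, 'c': 0, 'd': 1},
        2: {'a': 1, 'b': 2, 'c': 0, 'd': 0},
        3: {'a': 1, 'b': 2, 'c': 0, 'd': 1},
        4: {'a': 0, 'b': 0, 'c': 0, 'd': 0},
        5: {'a': 1, 'b': 0, 'c': 1, 'd': 0},
        6: {'a': 2, 'b': 0, 'c': 1, 'd': 1},
    }
    A = ('a', 'b', 'c')

    def ind(attrs, x, y):
        # (x, y) is in ind(attrs) iff x and y agree on every attribute in attrs
        return all(U[x][a] == U[y][a] for a in attrs)

    def pos(attrs):
        # positive region pos_attrs(d): objects whose ind(attrs)-class is pure in d
        return {x for x in U
                if all(U[x]['d'] == U[y]['d'] for y in U if ind(attrs, x, y))}

    def omega(x, y, p):
        # Skowron's condition omega(x, y): clauses (1), (2) and (3) above
        return ((x in p) != (y in p)) or \
               (x in p and y in p and not ind(['d'], x, y))

    def alpha(condition):
        # discernibility matrix: entries failing the condition become empty sets
        p = pos(A)
        return {(x, y): frozenset(a for a in A if U[x][a] != U[y][a])
                        if condition(x, y, p) else frozenset()
                for x, y in combinations(sorted(U), 2)}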


The discernibility function of the decision table is defined as follows:

Δ* = ∧_{x,y∈U} ∨ α*(x, y)

The discernibility function has the property that all the conjuncts of the minimal disjunctive normal form of the discernibility function are exactly the d-reducts of the conditional attribute set A.
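Continuing the sketch above, the reducts can be read off by expanding the conjunction of the nonempty matrix entries into disjunctive normal form, pruning with the absorption law along the way (deduplicating the clauses corresponds to the idempotent law). Again this is an illustrative implementation, not the authors' code:

    def reducts(matrix):
        # expand the conjunction of the nonempty entries into disjunctive
        # normal form; deduplicating clauses is the idempotent law, pruning
        # superset terms is the absorption law
        terms = [frozenset()]
        for clause in set(matrix.values()):
            if not clause:
                continue
            grown = {t | {a} for t in terms for a in clause}
            terms = [t for t in grown if not any(s < t for s in grown)]
        return terms

    print(reducts(alpha(omega)))
    # e.g. [frozenset({'a', 'b'}), frozenset({'a', 'c'})]

Each surviving term is one minimal hitting set of the matrix entries, i.e. one conjunct of the minimal disjunctive normal form.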

The conclusion is definitely true. But the condition ω(x, y) can be simplified, as we can see from two aspects.

The discussion of the first aspect can be induced from Fig. 1(a), from which we can see intuitively how to simplify the condition. The figure depicts a 4-class decision table. Theoretically, in the extreme situation, after positive reduct the universe may be reduced to as few as 5 classes, 4 positive classes and 1 boundary class, while the positive region of the decision table does not change (Fig. 1(c)). As to an n-class decision table, in the extreme situation, after positive reduct the universe may be reduced to as few as n+1 classes, n positive classes and 1 boundary class, with the positive region again unchanged. Of course, due to the practical situation of each attribute in a concrete decision table, such an extreme reduct is unlikely to happen; in a general way the reduct is as shown in Fig. 1(b).

[Fig. 1. Positive Region and Boundary Region of a 4-Class Decision Table: (a) initial situation; (b) reduct under the general situation; (c) reduct under the extreme situation that may possibly happen.]

It is obvious that the elements in the same decision class need not be distinguished, and that the elements in the boundary region need not be distinguished from one another. So if

x, y ∉ pos_A(d) ∨ (x, y) ∈ ind(d)  (4)

then α#(x, y) = ∅ (here, in order to distinguish it from the symbol α*(x, y) of the discernibility matrix proposed by Skowron, we denote it as α#(x, y); similarly, ν(x, y) substitutes for ω(x, y) as the condition discussed in the following sections).

In other words, in order to acquire the positive reduct, we just need to change the elements which satisfy condition (4) in the discernibility matrix of the information system corresponding to this decision table into the empty set. Equivalently, we only need to keep the elements which satisfy the converse of condition (4). The converse condition, i.e. ν(x, y), is derived as follows:

¬[x, y ∉ pos_A(d) ∨ (x, y) ∈ ind(d)]
= ¬[(x ∉ pos_A(d) ∧ y ∉ pos_A(d)) ∨ (x, y) ∈ ind(d)]
= (x, y) ∉ ind(d) ∧ (x ∈ pos_A(d) ∨ y ∈ pos_A(d))

So, for all x, y ∈ U, ν(x, y) is:

(x, y) ∉ ind(d) ∧ (x ∈ pos_A(d) ∨ y ∈ pos_A(d))  (5)

Obviously, determining ν(x, y) is simpler than determining ω(x, y).
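In the running Python sketch, the improved condition is a one-liner, and on the example table one can check that it leads to exactly the same reducts as ω(x, y) (the function name nu is ours):

    def nu(x, y, p):
        # improved condition (5): (x, y) not in ind(d), and x or y is positive
        return not ind(['d'], x, y) and (x in p or y in p)

    # the simpler criterion yields exactly the same reducts on the example
    assert sorted(map(sorted, reducts(alpha(nu)))) == \
           sorted(map(sorted, reducts(alpha(omega))))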

Now, let us discuss the condition from another aspect. For pairs satisfying condition (1), i.e. x ∈ pos_A(d) ∧ y ∉ pos_A(d), such y has three types, denoted y, y' and y'' respectively in Fig. 1(a):

(1) Type y, i.e. x ∈ pos_A(d) ∧ y ∉ pos_A(d) and (x, y) ∉ ind(d) and [y]_A ∩ [x]_d ≠ ∅;

(2) Type y', i.e. x ∈ pos_A(d) ∧ y' ∉ pos_A(d) and (x, y') ∈ ind(d). Under this condition there must exist a type-y element corresponding to y' such that (y, y') ∈ ind(A), and then α(x, y') = α(x, y). Since the discernibility function is Δ = ∧_{x,y∈U} ∨ α(x, y) and α(x, y) will inevitably emerge in the discernibility matrix, it follows from the idempotent law of the conjunctive operator that whether α(x, y') emerges in the discernibility matrix has no influence on the value of the discernibility function;


(3) Type y'', i.e. x ∈ pos_A(d) ∧ y'' ∉ pos_A(d) and (x, y'') ∉ ind(d) and [y'']_A ∩ [x]_d = ∅.

So, it is enough to have type y and type y" elements in If ( i ), i.e. v(y,z) doesn't hold, by ye A([xld) , i.e.discernibility matrix. All the other elements can be changed yE posA(d) , we get (y,z) E ind(d)into empty set without changing the ultimate value of By ye A([xld) again, then (x, y) E ind(d), by the transitivitydiscernibility function. Since type y and type y" elements . .

of indiscernibility relation, then (x,z) e ind(d)both satisfy x E posA (d) A y Jt posA (d) and (x, y) X ind(d), Thus z E [Xld, so we have [YXB[X1din other words, condition ( 1 ) can change into ..xe POSA(d) A Y posA(d) A (x, y) J ind(d) without changing Tfh(e), ze [Y]A 5d[X]d 0[(-- 5[the value of discernibility function by attaching Therefore, posX(d) - posB(d)(x, y) i ind(d) after it. By ( I ) (II) we have posA(d)=posB(d), sufficiency

Similarly, condition (2) can be changed into x ∉ pos_A(d) ∧ y ∈ pos_A(d) ∧ (x, y) ∉ ind(d). Thus the condition ω(x, y) changes into:

x ∈ pos_A(d) ∧ y ∉ pos_A(d) ∧ (x, y) ∉ ind(d)
or
x ∉ pos_A(d) ∧ y ∈ pos_A(d) ∧ (x, y) ∉ ind(d)
or
x, y ∈ pos_A(d) ∧ (x, y) ∉ ind(d)

It is easy to prove that this condition is completely equivalent to the condition ν(x, y) of the previous discussion; the check is a direct Boolean case split, sketched below.
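The omitted equivalence check, in our own LaTeX rendering, is a case split on which of x, y lies in the positive region:

    \begin{align*}
    \nu(x,y) \;=\;& (x,y)\notin \mathrm{ind}(d)\ \wedge\ \bigl(x\in \mathrm{pos}_A(d)\ \vee\ y\in \mathrm{pos}_A(d)\bigr)\\
    \;=\;& (x,y)\notin \mathrm{ind}(d)\ \wedge\ \bigl[\,(x\in \mathrm{pos}_A(d)\wedge y\notin \mathrm{pos}_A(d))\\
    &\vee\,(x\notin \mathrm{pos}_A(d)\wedge y\in \mathrm{pos}_A(d))\ \vee\ (x\in \mathrm{pos}_A(d)\wedge y\in \mathrm{pos}_A(d))\,\bigr]
    \end{align*}

Distributing the outer ∧ over the three cases gives exactly the three clauses listed above.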

by v(x, y), the much simpler condition, without changing the (II) if (x, y) o ind(d) AY posA(d) holds

value of discernibility function. Ultimately, the acquired Similarly provedpositive reduct is necessarily identical. We give out proof in By ( I )' (II), necessity holds.the following section. By (1), (2), the lemma holds.

III. IMPROVEMENT TO THE CONDITION OF NONEMPTY ELEMENTS IN DISCERNIBILITY MATRIX

As to a decision table S = (U, A, d), where U is the universe, |U| = n, A is the conditional attribute set and d is the decision attribute, the discernibility matrix is an n×n matrix, each element of which is

α#(x, y) = {a ∈ A | a(x) ≠ a(y) ∧ ν(x, y)}

where, for all x, y ∈ U, ν(x, y) is:

(x, y) ∉ ind(d) ∧ (x ∈ pos_A(d) ∨ y ∈ pos_A(d))

Lemma. pos_B(d) = pos_A(d) if and only if B ∩ α#(x, y) ≠ ∅ for every α#(x, y) ≠ ∅, where B ⊆ A.

Proof. Firstly we prove sufficiency.
(I) pos_B(d) ⊆ pos_A(d) is obvious.
(II) We prove that pos_A(d) ⊆ pos_B(d). Since pos_A(d) = ∪_{[x]_d ∈ U/d} A([x]_d) and pos_B(d) = ∪_{[x]_d ∈ U/d} B([x]_d), it suffices to prove that y ∈ A([x]_d) implies y ∈ B([x]_d), i.e. that [y]_A ⊆ [x]_d implies [y]_B ⊆ [x]_d.

Let z ∈ [y]_B; then (y, z) ∈ ind(B), hence B ∩ α#(y, z) = ∅. By the condition that B ∩ α#(x, y) ≠ ∅ for every α#(x, y) ≠ ∅, we have α#(y, z) = ∅, thus
(i) ν(y, z) does not hold, or
(ii) (y, z) ∈ ind(A).

If (i), i.e. ν(y, z) does not hold: by y ∈ A([x]_d), i.e. y ∈ pos_A(d), we get (y, z) ∈ ind(d). By y ∈ A([x]_d) again, (x, y) ∈ ind(d); by the transitivity of the indiscernibility relation, (x, z) ∈ ind(d). Thus z ∈ [x]_d, so [y]_B ⊆ [x]_d.

If (ii), then z ∈ [y]_A ⊆ [x]_d, so again [y]_B ⊆ [x]_d.

Therefore pos_A(d) ⊆ pos_B(d). By (I) and (II) we have pos_A(d) = pos_B(d); sufficiency holds.

Secondly we prove necessity. If α#(x, y) ≠ ∅, then ν(x, y) holds, i.e. (x, y) ∉ ind(d) ∧ (x ∈ pos_A(d) ∨ y ∈ pos_A(d)); this is equivalent to [(x, y) ∉ ind(d) ∧ x ∈ pos_A(d)] ∨ [(x, y) ∉ ind(d) ∧ y ∈ pos_A(d)], thus we can prove it in two parts:

(I) Suppose (x, y) ∉ ind(d) ∧ x ∈ pos_A(d) holds. From x ∈ pos_A(d), by pos_B(d) = pos_A(d) we have x ∈ pos_B(d), then [x]_B ⊆ [x]_d; and by (x, y) ∉ ind(d) we have y ∉ [x]_d. It follows that y ∉ [x]_B, i.e. (x, y) ∉ ind(B); since ν(x, y) holds, any attribute of B on which x and y differ belongs to α#(x, y), hence B ∩ α#(x, y) ≠ ∅.

(II) The case (x, y) ∉ ind(d) ∧ y ∈ pos_A(d) is proved similarly.

By (I) and (II), necessity holds. By the two parts together, the lemma holds.
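The lemma can also be checked mechanically on the example table of Section IV by enumerating all subsets B ⊆ A, a brute-force check added to the running Python sketch (our own addition, not part of the paper):

    from itertools import chain

    # brute-force check of the lemma on the example table:
    # pos_B(d) = pos_A(d)  iff  B meets every nonempty alpha#(x, y)
    m = alpha(nu)
    subsets = chain.from_iterable(combinations(A, k) for k in range(len(A) + 1))
    for B in map(set, subsets):
        hits_every_entry = all(B & entry for entry in m.values() if entry)
        assert (pos(B) == pos(A)) == hits_every_entry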

Theorem 1. If B ⊆ A is a minimal subset satisfying B ∩ α#(x, y) ≠ ∅ for every α#(x, y) ≠ ∅, then B is a d-reduct of A.

Proof. To prove that B is a d-reduct of A, we need to prove:
(1) pos_B(d) = pos_A(d);
(2) B is d-independent.

(1) By the sufficiency of the lemma, it follows directly that pos_B(d) = pos_A(d).
(2) Suppose that B is not d-independent; then there exists B' ⊂ B satisfying pos_{B'}(d) = pos_A(d). By the necessity of the lemma we obtain that B' ∩ α#(x, y) ≠ ∅ for every α#(x, y) ≠ ∅, which is contradictory with the minimality of B. The hypothesis does not hold, thus B is d-independent.

By (1) and (2), B is a d-reduct of A.


Let A' = A v a(x,y),A" = VAB , it is obvious that A# A#" (b) a"#(x,y)IX,J'EU

2B(=-2B3 45To prove the theorem, we need to prove the following 3 items: 1 2 3 4 S 6

(1) Va#(x,y).0,Bna`#(x,y).0 2 a

Suppose that 3]r# (x, y) . 0, B f #(x,y) = 0 , then we can let 3{ 1, ae B 4 a,b a,b

V 0, ot hers 5 a,b,c b,cThus (A" =0) (A" =1), contradiction! 6 a,b,c a, c a

IV. ILLUSTRATIVE EXAMPLE

Example 1. Table I gives a decision table, where A = {a, b, c} and d is the decision attribute. The corresponding discernibility matrix under each criterion is given in Table II.

TABLE I
DECISION TABLE (ref. from [4], p. 24, Table 1.6)

U | a b c | d
1 | 2 2 0 | 1
2 | 1 2 0 | 0
3 | 1 2 0 | 1
4 | 0 0 0 | 0
5 | 1 0 1 | 0
6 | 2 0 1 | 1

TABLE II
DISCERNIBILITY MATRIX

(a) α*(x, y) (ref. from [4], p. 25, Table 1.7)

  |   1   |   2   |   3   |  4  | 5
2 | a     |       |       |     |
3 | a     |       |       |     |
4 | a,b   | a,b   | a,b   |     |
5 | a,b,c | b,c   | b,c   |     |
6 |       | a,b,c | a,b,c | a,c | a

(b) α#(x, y)

  |   1   |   2   |   3   |  4  | 5
2 | a     |       |       |     |
3 |       |       |       |     |
4 | a,b   |       | a,b   |     |
5 | a,b,c |       | b,c   |     |
6 |       | a,b,c |       | a,c | a

The corresponding discernibility function of Table II(b) is

Δ# = a(a∨b)(a∨b∨c)(a∨b∨c)(a∨b)(b∨c)(a∨c)a = ab ∨ ac

Therefore, the decision table has two d-reducts, {a, b} and {a, c}. The result is identical with [4]. It is easy to see from the tables that the new criterion simplifies the discernibility matrix and the discernibility function. Compared with the original criterion, the simplification concerns the elements in the same decision class: if two elements are in the same decision class, it follows directly that α#(x, y) = ∅. Thus the computational complexity decreases considerably, and the effect is more prominent when the number of decision classes is small.
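In the running Python sketch, the whole example can be verified end to end; both the original criterion ω and the improved criterion ν must deliver exactly the two reducts {a, b} and {a, c}:

    # end-to-end check of Example 1: both criteria give {a, b} and {a, c}
    expected = [{'a', 'b'}, {'a', 'c'}]
    for criterion in (omega, nu):
        found = sorted(map(set, reducts(alpha(criterion))), key=sorted)
        assert found == expected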

V. CONCLUSIONS

The paper improves the criterion ω(x, y) which the elements of the discernibility matrix of the positive reduct of a decision table satisfy; hence the complexity of the algorithm for finding all the reducts decreases. The improvement is effective not only for manual operation but also for computer processing. The decrease in complexity shows at two sides: 1. the verification of the criterion is simpler; 2. fewer elements of the discernibility matrix satisfy the criterion.

ACKNOWLEDGMENT

This work was partially supported by the Science Foundation of Henan Province (0611055800), the Science & Technology Tackling Project of Henan Province (0424460013), and the National Natural Science Foundation of China (60474022).

REFERENCES

[1] Z. Pawlak, "Rough sets," International Journal of Computer and Information Science, 11 (1982): 341-356.
[2] Z. Pawlak, Rough Sets: Theoretical Aspects of Reasoning about Data, Kluwer Academic Publishers, Boston, 1991.
[3] A. Skowron, C. Rauszer, "The discernibility matrices and functions in information systems," in: R. Slowinski (Ed.), Intelligent Decision Support - Handbook of Applications and Advances of the Rough Sets Theory, Kluwer Academic Publishers, Dordrecht, 1992: 331-362.
[4] ZHANG Wen-xiu, WU Wei-zhi, LIANG Ji-ye, LI De-yu, Rough Sets Theory and Approach, Science Press, 2001.
[5] Z. Pawlak, "Vagueness and uncertainty: a rough set perspective," Computational Intelligence, 1995, 11(2): 227-232.
[6] Z. Pawlak, J. Grzymala-Busse, R. Slowinski, W. Ziarko, "Rough sets," Communications of the ACM, 1995, 38(11): 89-95.
[7] W. Ziarko, "Introduction to the special issue on rough sets and knowledge discovery," Computational Intelligence, 1995, 11(2): 223-225.
[8] W. Ziarko, "Variable precision rough set model," Journal of Computer and System Sciences, 1993, 46(1): 39-59.
[9] M. Beynon, "Reducts within the variable precision rough sets model: A further investigation," European Journal of Operational Research, 2001, 134(3): 592-605.
[10] X. Chen, S. Zhu, et al., "Entropy based uncertainty measures for classification rules with inconsistency tolerance," Proceedings of the IEEE International Conference on Systems, Man and Cybernetics, Nashville: Institute of Electrical and Electronics Engineers Inc., 2000: 2816-2821.
[11] J. Jelonek, et al., "Rough set reduction of attributes and their domains for neural networks," Computational Intelligence, 1995, 11(2): 339-347.
[12] R. Yasdi, "Combining rough sets learning and neural learning method to deal with uncertain and imprecise information," Neural Computing and Applications, 1995, 7(1): 61-84.