Computing Word-Pair Antonymy


Saif Mohammad (Univ. of Maryland), Bonnie Dorr (Univ. of Maryland), Graeme Hirst (Univ. of Toronto)

EMNLP 2008

Introduction
  • Antonymy: a pair of semantically contrasting words.
  • Examples:
      Strongly antonymous: Hot / Cold
      Semantically contrasting: Enemy / Fan
      Not antonymous: Penguin / Clown

Usage
  • Detecting contradictions
  • Detecting humor
  • Automatic creation of a thesaurus

Problem Definition
  • Given a thesaurus, find the antonymous category pairs.
  • Assign a degree of antonymy to each pair of antonymous categories.

Hypothesis (1)
The Co-occurrence Hypothesis of Antonyms
  • Antonymous word pairs occur together much more often than other word pairs.

Hypothesis (1)
Empirical proof:
  • 1,000 antonymous pairs from WordNet
  • 1,000 randomly generated word pairs
  • Use the British National Corpus (BNC) as the corpus, with a window size of 5.
  • Calculate the mutual information (MI) for each word pair and average the results.
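The measurement described above can be sketched in a few lines. This is a minimal illustration, assuming that "MI" means pointwise mutual information estimated from co-occurrence counts within a fixed token window; the function names `pmi_table` and `average_pmi`, and the choice to skip pairs that never co-occur when averaging, are assumptions made for the example and are not taken from the talk.

```python
import math
from collections import Counter

def pmi_table(tokens, window=5):
    """Return a function pmi(x, y) estimated from windowed co-occurrence counts."""
    word_counts = Counter(tokens)
    pair_counts = Counter()
    for i, left in enumerate(tokens):
        # Pair each token with the tokens that follow it inside the window.
        for right in tokens[i + 1 : i + 1 + window]:
            if left != right:
                pair_counts[frozenset((left, right))] += 1
    total_words = sum(word_counts.values())
    total_pairs = sum(pair_counts.values())

    def pmi(x, y):
        joint = pair_counts[frozenset((x, y))]
        if joint == 0:
            return float("-inf")  # the two words were never seen together
        p_xy = joint / total_pairs
        p_x = word_counts[x] / total_words
        p_y = word_counts[y] / total_words
        return math.log2(p_xy / (p_x * p_y))

    return pmi

def average_pmi(pmi, pairs):
    """Average PMI over the pairs that co-occur at least once (an assumption)."""
    scores = [pmi(x, y) for x, y in pairs]
    finite = [s for s in scores if s != float("-inf")]
    return sum(finite) / len(finite) if finite else 0.0
```

Running `average_pmi` once over the antonym pairs and once over the random pairs would give the two averages being compared; how the original experiment treated pairs with no co-occurrences is not stated in the slides.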

MI of word pairs:

                     Average    Standard deviation
  Antonymous pair    0.94       2.27
  Random pair        0.01       0.37

Hypothesis (2)
The Distributional Hypothesis of Antonyms
  • Antonyms occur in similar contexts more often than non-antonymous words.
  • Ex:
      work: the activity of doing a job
      play: the activity of relaxation

Hypothesis (2)
Empirical proof:
  • Use the same set of word pairs as in Hypothesis (1).
  • Calculate the distributional distance between their categories.

Distributional distance between categories:

                     Average    Standard deviation
  Antonymous pair    0.30       0.23
  Random pair        0.23       0.11

Distributional Distance between Two Thesaurus Categories

  • c1, c2: thesaurus categories
  • I(x, y): pointwise mutual information between x and y
  • T(c): the set of all words w such that I(c, w) > 0

Method
  • Determine pairs of thesaurus categories that are contrasting in meaning.
  • Use the co-occurrence and distributional hypotheses to determine the degree of antonymy of word pairs.
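The formula that the symbols I(x, y) and T(c) belong to is not included in this transcript. One measure that fits those definitions is a Lin-style overlap: the summed association of the context words shared by both categories, divided by the summed association of all context words of each category. The sketch below implements that measure as an assumption; it is not confirmed to be the exact formula shown in the talk, and the function names are hypothetical.

```python
def positive_context(pmi_row):
    """T(c): the words w whose association I(c, w) with category c is positive."""
    return {w for w, score in pmi_row.items() if score > 0}

def distributional_closeness(pmi_c1, pmi_c2):
    """Lin-style overlap of two categories' positive-PMI context words.

    pmi_c1 and pmi_c2 map each context word w to I(c1, w) and I(c2, w).
    The result lies in [0, 1]; higher means the categories share more context.
    """
    t1 = positive_context(pmi_c1)
    t2 = positive_context(pmi_c2)
    shared = sum(pmi_c1[w] + pmi_c2[w] for w in t1 & t2)
    total = sum(pmi_c1[w] for w in t1) + sum(pmi_c2[w] for w in t2)
    return shared / total if total else 0.0

# Made-up association scores for two categories, for illustration only.
pmi_hot = {"water": 2.0, "weather": 1.0, "sun": 1.0, "desk": -0.5}
pmi_cold = {"water": 1.0, "weather": 1.0, "ice": 2.0}
closeness = distributional_closeness(pmi_hot, pmi_cold)
```

Under this reading, two contrasting categories that share many strongly associated context words score close to 1, which is the quantity the distributional hypothesis appeals to.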

Method
  • 16 affix rules were applied to the Macquarie Thesaurus.
  • 2,734 word pairs were generated as a seed set.
  • Exceptions (e.g. sect / insect, which is not an antonym pair) are relatively few.
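Seed generation by affix rules can be sketched as follows. The 16 rules themselves are not listed in the slides, so the four prefixes below are illustrative assumptions, and `affix_seed_pairs` is a hypothetical name. The toy vocabulary includes "sect" and "insect" to show how a purely string-based rule produces the kind of exception mentioned above.

```python
# Illustrative prefix rules only; the 16 rules used in the talk are not listed.
PREFIX_RULES = ["un", "in", "dis", "non"]

def affix_seed_pairs(vocabulary, prefixes=PREFIX_RULES):
    """Pair each word X with prefix+X whenever both appear in the vocabulary."""
    vocab = set(vocabulary)
    pairs = set()
    for word in vocab:
        for prefix in prefixes:
            candidate = prefix + word
            if candidate in vocab:
                pairs.add((word, candidate))
    return sorted(pairs)

# Toy vocabulary, for illustration only.
toy_vocabulary = [
    "clear", "unclear", "honest", "dishonest",
    "formal", "informal", "sect", "insect", "table",
]
seed_pairs = affix_seed_pairs(toy_vocabulary)
```

The pair ("sect", "insect") is generated alongside the genuine antonyms because the rule only checks string shape, not meaning.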

Method
  • 10,807 semantically contrasting word pairs from WordNet

Method
  • If any word in thesaurus category C1 is antonymous to any word in category C2 as per a seed antonym pair, then the two categories are marked as contrasting.
  • If no word in C1 is antonymous to any word in C2, then the categories are considered not contrasting.

Method
Degree of antonymy: category level
  • By the distributional hypothesis of antonyms, we claim that the degree of antonymy between two contrasting thesaurus categories is directly proportional to the distributional closeness of the two concepts.

Method
Degree of antonymy: word level
  • Highly antonymous: the target words belong to the same thesaurus paragraphs as any of the seed antonyms linking the two contrasting categories.
  • Medium antonymous: the target words do not both belong to the same paragraphs as a seed antonym pair, but occur in contrasting categories.
  • Lowly antonymous: the target words have a low tendency to co-occur.

Method
Adjacency Heuristic
  • Most thesauri are ordered such that contrasting categories tend to be adjacent.

Evaluation
  • 1,112 closest-opposite questions designed to prepare students for the GRE (Graduate Record Examination)
  • 162 questions as the development set
  • 950 questions as the test set

Evaluation
Closest-opposite questions
  • Ex: adulterate: a. renounce  b. forbid  c. purify  d. criticize  e. correct
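The category-marking rule and the closest-opposite task described above can be sketched together. This is a simplified illustration, not the evaluated system: it only checks whether the target and an option fall in contrasting categories, and it leaves out the degree-of-antonymy ranking and the adjacency heuristic. The toy thesaurus, the single seed pair, and all function names are assumptions made for the example.

```python
from itertools import product

def contrasting_categories(thesaurus, seed_pairs):
    """Mark categories C1, C2 as contrasting if a seed antonym pair links them.

    thesaurus maps a category name to the set of words it lists.
    """
    contrasting = set()
    for (c1, words1), (c2, words2) in product(thesaurus.items(), repeat=2):
        if c1 == c2:
            continue
        for x, y in seed_pairs:
            if (x in words1 and y in words2) or (y in words1 and x in words2):
                contrasting.add(frozenset((c1, c2)))
                break
    return contrasting

def categories_of(word, thesaurus):
    """Return the names of all categories that list the word."""
    return {c for c, words in thesaurus.items() if word in words}

def closest_opposite(target, options, thesaurus, contrasting):
    """Return the options that sit in a category contrasting with the target's."""
    hits = []
    for option in options:
        category_pairs = product(
            categories_of(target, thesaurus), categories_of(option, thesaurus)
        )
        if any(frozenset(pair) in contrasting for pair in category_pairs):
            hits.append(option)
    return hits

# Hypothetical toy thesaurus and seed pair, for illustration only.
toy_thesaurus = {
    "purity": {"pure", "purify", "clean"},
    "impurity": {"impure", "adulterate", "contaminate"},
    "rejection": {"renounce", "forbid"},
    "judgement": {"criticize", "correct"},
}
toy_contrasting = contrasting_categories(toy_thesaurus, [("pure", "impure")])
answers = closest_opposite(
    "adulterate",
    ["renounce", "forbid", "purify", "criticize", "correct"],
    toy_thesaurus,
    toy_contrasting,
)
```

When several options survive this filter, a fuller version would rank them by the word-level degree of antonymy.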

Discussion
  • The automatic approach does indeed mimic human intuitions of antonymy.
  • Even in languages without a wordnet, substantial accuracy may be achieved.
  • WordNet seeds and affix-generated seeds are complementary.

Conclusion
  • Proposed an empirical approach to antonymy that combines corpus co-occurrence statistics with the structure of a thesaurus.
  • The system can identify the degree of antonymy between word pairs.
  • Provided empirical evidence that antonym pairs tend to be used in similar contexts.

Thanks