This is the major motivation for the robust WSD tasks of the CLEF ad-hoc track. UniGE experiments on robust word sense disambiguation. The effect of different context representations on word sense disambiguation. On robustness and domain adaptation using SVD for word sense disambiguation. Learning a robust word sense disambiguation model using hypernyms in definition sentences. In computational linguistics, word sense disambiguation (WSD) is an open problem concerned with identifying which sense of a word is used in a sentence. Abstract: this paper describes the participation of the IXA NLP group in the CLEF 2008 Robust-WSD task.
Word sense disambiguation (WSD) has been a basic and ongoing issue since its introduction to the natural language processing (NLP) community. We follow this scenario by proposing an unsupervised technique that disambiguates and annotates words by their specific sense, considering their context. Most of the work is based on the use of WordNet [3] as a semantic resource. Machine learning techniques for word sense disambiguation. This paper presents a robust method for collective disambiguation, by harnessing context from knowledge bases and using a new form of coherence graph. An iterative Sudoku-style approach to subgraph-based word sense disambiguation. For example, given "the patient complained of cold hands", a word sense disambiguation system might be asked to select between two different senses of cold.
Section 4 provides implementation details for three word sense disambiguation problems. In WSD the goal is to tag each ambiguous word in a text with one of the senses known a priori. These models consist of a parametric form and parameter estimates. For example, a dictionary may have over 50 different senses of the word play, each having a different meaning based on the context of the word's usage in a sentence. Starting with the hyperlinks available in Wikipedia, we show how we can generate sense-annotated corpora that can be used for building accurate and robust sense classifiers. Robust word sense disambiguation systems using machine learning approaches [Y95, S98, M02] and dictionary-based approaches [L86, PBP03, MTF04] have been developed in the past.
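To make the dictionary-based family concrete, here is a minimal sketch in the spirit of the simplified Lesk algorithm: pick the sense whose dictionary gloss shares the most words with the surrounding context. The sense identifiers and glosses below are illustrative only, not taken from any of the cited systems.

```python
# Simplified Lesk-style disambiguation: choose the sense whose gloss
# has the largest word overlap with the context of the target word.

def simplified_lesk(target, context_words, sense_inventory):
    """Return the sense id whose gloss overlaps most with the context."""
    best_sense, best_overlap = None, -1
    context = set(w.lower() for w in context_words)
    for sense_id, gloss in sense_inventory[target].items():
        overlap = len(context & set(gloss.lower().split()))
        if overlap > best_overlap:
            best_sense, best_overlap = sense_id, overlap
    return best_sense

# Toy sense inventory for "play" (illustrative glosses only).
inventory = {
    "play": {
        "play%theatre": "a dramatic work intended for performance by actors on a stage",
        "play%activity": "engage in an activity for enjoyment or recreation",
    }
}

context = "the actors rehearsed the play before the performance on stage".split()
print(simplified_lesk("play", context, inventory))  # -> "play%theatre"
```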
WSD is an AI-complete problem, that is, a problem whose solution is at least as hard as the most difficult problems in the field of artificial intelligence. The question of the best window size for word sense disambiguation has long been open. Multi-sense word embeddings were devised to alleviate these and other problems by representing each word sense separately, but studies in this area are still in their infancy and much remains to be explored. The second chapter describes some earlier approaches to word sense disambiguation.
WSD [Navigli09] is the more general task of mapping content words to a predefined sense inventory. AutoExtend produces token embeddings for synsets and lexemes from a set of pretrained word embeddings. Manual construction of deep and rich semantic lexical knowledge bases (LKBs) is a titanic task. The solution to this problem impacts other computer-related writing, such as discourse, improving relevance of search engines, anaphora resolution, coherence, and inference. Word sense disambiguation is a technique in the field of natural language processing whose main task is to find the correct sense in which an ambiguous word is used. Based on the relevant occurrences of ambiguous words, we modify the training objective of skip-gram to learn word and sense representations. Word sense disambiguation with multilingual features. The former will be suitable for the disambiguation of high-frequency words. Word sense disambiguation (WSD) is the ability to identify the meaning of words in context in a computational manner. One of the fundamental tasks in natural language processing is word sense disambiguation (WSD). Using Wikipedia for automatic word sense disambiguation. Context is the only means to identify the sense of a polysemous word.
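Once word and sense vectors exist (for example from an AutoExtend-style extension of pretrained embeddings or a modified skip-gram objective, as described above), a simple disambiguation rule is to pick the sense whose vector is closest to the averaged context vector. The 3-dimensional toy vectors below are invented for illustration; real sense embeddings live in a much higher-dimensional space.

```python
import numpy as np

def cosine(a, b):
    # Cosine similarity with a small epsilon to avoid division by zero.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def disambiguate(context_words, sense_vectors, word_vectors):
    """Pick the sense whose vector is closest to the averaged context vector."""
    vecs = [word_vectors[w] for w in context_words if w in word_vectors]
    if not vecs:
        return None
    context_vec = np.mean(vecs, axis=0)
    return max(sense_vectors, key=lambda s: cosine(sense_vectors[s], context_vec))

# Toy vectors (illustrative only): two senses of "bank" and a few context words.
word_vectors = {
    "river": np.array([0.9, 0.1, 0.0]),
    "water": np.array([0.8, 0.2, 0.1]),
    "money": np.array([0.0, 0.9, 0.2]),
}
sense_vectors = {
    "bank%riverside": np.array([0.85, 0.15, 0.05]),
    "bank%finance":   np.array([0.05, 0.90, 0.15]),
}

print(disambiguate(["river", "water"], sense_vectors, word_vectors))
# -> "bank%riverside"
```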
It is debatable how important an improvement of 2 or 4 percentage points is. Word sense disambiguation (WSD) is a specific task of computational linguistics which aims at automatically identifying the correct sense of a given ambiguous word from a set of predefined senses. This dissertation pursues corpus-based approaches that learn probabilistic models of word sense disambiguation from large amounts of text. Knowledge sources for word sense disambiguation. In spite of these advances, the accuracy of disambiguation is still rather low. Using the WordNet hierarchy, we embed the construction of Abney and Light (1999) in the topic model and show that automatically learned domains improve WSD accuracy compared to alternative contexts. Towards robust high-performance word sense disambiguation of English verbs using rich linguistic features. The algorithm uses these properties to incrementally identify collocations for target senses of a word, given a few seed collocations; note that the problem here is sense disambiguation.
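The seed-collocation idea mentioned above can be sketched as a small bootstrapping loop: start from a few seed collocations per sense, label the occurrences they cover, then harvest new collocations from the labelled contexts and repeat. The sense labels, seeds and sentences below are toy data, and the harvesting criterion (a raw frequency threshold) is a deliberate simplification of the scored decision lists used in the literature.

```python
from collections import Counter

def bootstrap(contexts, seeds, target, iterations=3, min_count=2):
    """Iteratively grow per-sense collocation sets from a few seed collocations."""
    collocations = {sense: set(words) for sense, words in seeds.items()}
    labels = {}
    for _ in range(iterations):
        # 1) Label every context that contains a known collocation for some sense.
        for i, words in enumerate(contexts):
            for sense, coll in collocations.items():
                if coll & set(words):
                    labels[i] = sense
        # 2) Harvest frequent co-occurring words from labelled contexts as new
        #    collocations, excluding the target word itself.
        for sense in collocations:
            counts = Counter(w for i, ws in enumerate(contexts) if labels.get(i) == sense
                             for w in ws if w != target)
            collocations[sense] |= {w for w, c in counts.items() if c >= min_count}
    return labels, collocations

contexts = [
    ["manufacturing", "plant", "equipment"],
    ["plant", "leaf", "garden"],
    ["factory", "plant", "equipment"],
    ["garden", "plant", "flower", "leaf"],
]
seeds = {"plant%factory": {"manufacturing"}, "plant%living": {"garden"}}
labels, colls = bootstrap(contexts, seeds, target="plant")
print(labels)
# -> {0: 'plant%factory', 1: 'plant%living', 3: 'plant%living'}
# Context 2 matches no collocation yet, so it stays unlabelled in this toy run.
```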
On the one hand, WSD will clearly not revolutionise IR or render it a solved problem. The use of supervised learning for word sense disambiguation is an active area of research. A unified model for word sense representation and disambiguation. The sense representations can be applied to many language understanding tasks, including word sense disambiguation. This is the process of assigning a sense to a word, where that sense is found in a dictionary or other predetermined sense inventory. Resnik discussed a probabilistic model that captures the co-occurrence behavior of predicates and conceptual classes in a taxonomy for noun sense disambiguation [4]. The best reported result for word sense disambiguation is 71. In linguistics, a word sense is one of the meanings of a word. Supervised word sense disambiguation (WSD) for truly polysemous words (in contrast to homonyms) is difficult for machine learning, mainly due to two problems. If the program's word sense disambiguation is not robust enough, or if additional clues are absent, the program can make errors in translation.
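As a minimal sketch of the supervised setting described above, a per-word classifier can be trained on sense-tagged contexts using bag-of-words features. The tiny "bass" training set below is invented for illustration and is far too small for real use; it only shows the shape of the approach.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Sense-tagged training contexts for the ambiguous word "bass" (toy data).
train_contexts = [
    "caught a large bass in the lake yesterday",
    "the bass swam upstream past the fishermen",
    "turned up the bass on the stereo speakers",
    "the bass line drives the whole song",
]
train_senses = ["bass%fish", "bass%fish", "bass%music", "bass%music"]

# Bag-of-words features from the context, fed to a Naive Bayes classifier.
clf = make_pipeline(CountVectorizer(), MultinomialNB())
clf.fit(train_contexts, train_senses)

print(clf.predict(["he played a deep bass note on his guitar"]))
# Likely -> ['bass%music'], though with so little data this is only a sketch.
```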
Approaches for word sense disambiguation: a survey. Robust utilization of context in word sense disambiguation. Robust semi-supervised and ensemble-based methods in word sense disambiguation. Word sense disambiguation (WSD) can be defined as the ability to recognize the meaning of words in a given context in a computational manner. Given word vectors and sense vectors, we propose two simple and efficient WSD algorithms to obtain more relevant occurrences for each sense. This paper describes experiments for the CLEF 2008 Robust-WSD task, both for the monolingual English and the bilingual Spanish-to-English subtasks. The disambiguation process replaced each occurrence of a term composed of one or more words by an XML element containing the term identifier, an extracted lemma, a part-of-speech (POS) tag (noun, verb, adjective), the original word form (wf) and a list of senses together with their respective scores.
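The annotation step just described can be sketched as building one XML element per term occurrence, carrying the identifier, lemma, POS, original word form and scored senses. The element and attribute names used here are assumptions for illustration, not the task's exact schema, and the sense keys and scores are made up.

```python
import xml.etree.ElementTree as ET

def term_element(term_id, lemma, pos, word_form, scored_senses):
    """Build an XML element for one disambiguated term occurrence.

    Attribute names (id, lemma, pos, wf) and the WSD child element are
    illustrative assumptions; adapt them to the actual collection schema.
    """
    term = ET.Element("TERM", id=term_id, lemma=lemma, pos=pos, wf=word_form)
    for sense_id, score in scored_senses:
        ET.SubElement(term, "WSD", sense=sense_id, score=f"{score:.3f}")
    return term

elem = term_element("t42", "bank", "N", "banks",
                    [("bank%1:17:01::", 0.71), ("bank%1:14:00::", 0.29)])
print(ET.tostring(elem, encoding="unicode"))
```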
Robust and efficient PageRank for word sense disambiguation. Different contexts generally give different results, even for the same algorithm. Abstract: this task was meant to compare the results of two different retrieval techniques. Section 3 presents the structural semantic interconnection algorithm and describes the context-free grammar for detecting semantic interconnections. While the NED problem is similar, it faces the challenge that the ambiguity of entity names tends to be much higher. A word sense corresponds either neatly to a seme (the smallest possible unit of meaning) or a sememe (a larger unit of meaning), and polysemy of a word or phrase is the property of having multiple semes or sememes and thus multiple senses. Lexical ambiguity resolution, or word sense disambiguation (WSD), is the task of identifying the intended sense of an ambiguous word in context.
Word sense disambiguation is the task of finding the correct sense of a word and automatically assigning that sense to polysemous words in a particular context. Selecting the most appropriate sense for an ambiguous word is a common problem in natural language processing. Searching semantic resources for complex selectional restrictions. Robust disambiguation of named entities in text. The human brain is quite proficient at word-sense disambiguation. Integrating word sense disambiguation into an information retrieval system could potentially improve its performance. All algorithms for word sense disambiguation make use of information within a context window of the target word. This paper proposes a method to improve the robustness of a word sense disambiguation (WSD) system for Japanese. Its application lies in many different areas, including sentiment analysis, information retrieval (IR), machine translation and knowledge graph construction. Two WSD classifiers are trained from a word sense-tagged corpus. In addition to the if-then rules of the shallow approach, algorithms are also used to determine correct interpretations.
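The context-window idea mentioned above is simple enough to sketch directly: the features available to a WSD algorithm are the words within n positions of the target occurrence, and n is a tunable parameter (the "best window size" question raised earlier). The example reuses the "cold hands" sentence from this document.

```python
def context_window(tokens, target_index, n=3):
    """Return the words within n positions of the target, excluding the target."""
    start = max(0, target_index - n)
    end = min(len(tokens), target_index + n + 1)
    return tokens[start:target_index] + tokens[target_index + 1:end]

tokens = "the patient complained of cold hands and poor circulation".split()
print(context_window(tokens, tokens.index("cold"), n=3))
# -> ['patient', 'complained', 'of', 'hands', 'and', 'poor']
```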
Word sense disambiguation (WSD) [2] is the solution to the problem. Disambiguating named entities in natural-language text maps mentions of ambiguous names onto canonical entities, like people or places, registered in a knowledge base such as DBpedia or YAGO. Research in word sense disambiguation (WSD) has a long history. Word sense disambiguation: thesaurus-based methods, dictionary-based methods, supervised methods, the Lesk algorithm (Michael Lesk), simplified Lesk, corpus Lesk.
Through word sense disambiguation experiments performed on the Wikipedia-based sense-annotated data. In Arabic, the main cause of word ambiguity is the lack of diacritics in most digital documents. In this paper an adaptation of the PageRank algorithm recently proposed for word sense disambiguation is presented. In general terms, word sense disambiguation (WSD) involves the association of a given word in context with one of its senses. Although humans resolve ambiguities in an effortless manner, this matter remains an open problem in computer science, owing to the complexity of the task. This task is defined as the ability to computationally detect which sense is being conveyed in a particular context. We tried several query and document expansion and translation strategies, with and without word sense disambiguation. Proceedings of the 20th International Conference on Computational Linguistics. Abstract: this report describes our approach to the robust word sense disambiguation task. We describe how conventional subgraph-based WSD treats the two steps of (1) subgraph construction and (2) disambiguation via graph centrality measures as ordered and atomic.
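A minimal sketch of the graph-based direction described above: build a (sub)graph whose nodes are candidate senses, link senses that are semantically related, bias the random walk towards senses of the words actually present in the context, and rank candidates by their PageRank score. The toy graph below stands in for a WordNet-derived knowledge base; the sense names and edges are invented for illustration.

```python
import networkx as nx

# Toy sense graph: two senses of "bank" plus senses of two context words.
G = nx.Graph()
G.add_edges_from([
    ("bank%finance", "money%currency"),
    ("bank%finance", "deposit%payment"),
    ("bank%riverside", "water%liquid"),
    ("money%currency", "deposit%payment"),
])

# Personalization vector: concentrate the restart probability on the senses
# of the context words observed in the text ("money", "deposit").
personalization = {n: 0.0 for n in G.nodes}
for node in ("money%currency", "deposit%payment"):
    personalization[node] = 0.5

scores = nx.pagerank(G, personalization=personalization)
candidates = ["bank%finance", "bank%riverside"]
print(max(candidates, key=scores.get))  # -> "bank%finance"
```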
A WordNet-based algorithm for word sense disambiguation. Multi-sense embeddings through a word sense disambiguation process. University of Groningen: linguistic knowledge and word sense disambiguation.