Automatic methods for metaphor identification in Russian-language texts: dissertation and abstract topic under VAK RF specialty 10.02.21, Candidate of Sciences Badryzlova Yulia Gennadyevna

  • Badryzlova Yulia Gennadyevna
  • Candidate of Sciences
  • 2019, Federal State Autonomous Educational Institution of Higher Education "National Research University Higher School of Economics"
  • VAK RF specialty 10.02.21
  • Number of pages: 206
Badryzlova Yulia Gennadyevna. Automatic methods for metaphor identification in Russian-language texts: Candidate of Sciences dissertation: 10.02.21 - Applied and mathematical linguistics. Federal State Autonomous Educational Institution of Higher Education "National Research University Higher School of Economics". 2019. 206 pp.

Table of contents of the dissertation, Candidate of Sciences Badryzlova Yulia Gennadyevna

Table of Contents


Chapter I. Metaphor as a computational problem

1. Annotated corpora and databases of metaphor

1.1. Top-down and bottom-up approaches to metaphor identification in discourse

1.2. MIPVU: a procedure for linguistic metaphor identification

1.3. VUAMC: the VU Amsterdam Metaphor Corpus

2. Computational approaches to metaphor identification: state-of-the-art

2.1. Klebanov, Leong, Gutierrez, Shutova, and Flor (2016)

2.2. Klebanov, Leong, Heilman, and Flor (2014)

2.3. Mu, Yannakoudakis, and Shutova (2019)

2.4. Bulat, Clark, and Shutova (2017)

2.5. Shutova, Kiela, and Maillard (2016)

2.6. Stemle and Onysko (2018)

2.7. Wu et al. (2018)

2.8. Turney, Neuman, Assaf, and Cohen (2011)

2.9. Hovy et al. (2013)

3. Computational metaphor identification systems for Russian

3.1. Strzalkowski et al. (2013)

3.2. Tsvetkov, Mukomel, and Gershman (2013)

3.3. Tsvetkov et al. (2014)

Summary of Chapter I

Chapter II. Experimental corpus

4. Corpus design

4.1. Selection of data

4.2. Selection of target verbs

5. Corpus annotation

5.1. Non-metaphoric class


5.2. Metaphoric class

5.3. Distribution of metaphoric subclasses in the corpus

6. Annotation reliability test

6.1. Selection of sentences

6.2. Annotator instructions

6.3. Binarization of categorical annotation

6.4. Annotation results and analysis

Summary of Chapter II

Chapter III. Automated metaphor identification experiment

7. Motivation behind the choice of features

7.1. Motivation behind the use of distributional semantic feature

7.2. Motivation behind the use of lexical co-occurrence feature

7.3. Motivation behind the use of morphosyntactic co-occurrence feature

8. Data preprocessing and the context windows

9. The feature set

9.1. Distributional semantic features

9.2. Lexical co-occurrence features

9.3. Morphosyntactic co-occurrence feature

9.4. Concreteness / abstractness feature

9.5. Flag words and quotation marks features

10. Experimental setup

11. Results

11.1. Evaluation of alternative parameters of the features

11.2. Window sensitivity

11.3. Inefficient features

11.4. Classification results

Summary of Chapter III

Chapter IV. Linguistic analysis of experimental results

12. Discussion: results of the lexical classifier and their implications

12.1. Correlation between lexical diversity of MET and NONMET subcorpora and performance of the lexical classifier

12.2. Feature importance

12.3. Detecting possible lexical predictors

12.4. Correlation between metaphor association and concreteness

13. Discussion: results of the distributional semantic classifier

13.1. Linguistic interpretation of the performance across datasets

13.2. Correlation between metaphoricity, semantic similarity, and accuracy

14. Discussion: results of the morphosyntactic classifier

14.1. Correlation between metaphor association of grammatical categories and the performance of the morphological classifier

14.2. Feature importance

Summary of Chapter IV

Thesis summary

List of References

List of tables

List of figures

Appendix 1. Annotator guidelines for the inter-annotator reliability test (Chapter II. Section 3.2)

Appendix 2. Concrete ('thingness') paradigm words (Chapter III. Section

Appendix 3. Abstract paradigm words (Chapter III. Section 9.4)


Introduction of the dissertation (part of the abstract) on the topic "Automatic methods for metaphor identification in Russian-language texts"


Metaphor occupies a prominent place in contemporary linguistic theory: it is recognized to be one of the most powerful cognitive tools with which humans conceptualize (Lakoff & Johnson, 1980a). Evidence from psycholinguistic research demonstrates that metaphor guides reasoning and decision-making in societal (Thibodeau & Boroditsky, 2011), economic (L. Jia & Smith, 2013; Landau, Keefer, & Rothschild, 2014; Morris, Sheldon, Ames, & Young, 2007; Robins & Mayer, 2000), health-related (Gallagher, McAuley, & Moseley, 2013; Hauser & Schwarz, 2015; Hendricks & Boroditsky, 2016; Scherer, Scherer, & Fagerlin, 2015), educational (Landau, Oyserman, Keefer, & Smith, 2014), and environmental (Flusberg, Matlock, & Thibodeau, 2017; Mio, Thompson, & Givens, 1993) issues.

Metaphor is truly ubiquitous in everyday discourse and forms a fundamental part of the language system. Estimates of its pervasiveness are invariably high: in a multi-domain corpus, an average of 0.3 (single-word) metaphors occurs per sentence (Shutova & Teufel, 2010); in genre-specific corpora, metaphor accounts for 5-18% of the total number of words (G. J. Steen et al., 2010). Sardinha (2008) estimated the statistical probability for a word form to occur metaphorically in a general-domain corpus as 0.7.

Not surprisingly, metaphor identification and interpretation pose a serious challenge to a wide range of real-world NLP applications, such as information retrieval, machine translation, question answering, information extraction, opinion mining, and others. Computational work on metaphor identification and interpretation began in the early-mid 1990s (Fass, 1991; Martin, 1990, 1994); the latest advances in corpus linguistics and machine learning sparked a large-scale wave of computational metaphor projects. A series of Workshops on Metaphor in NLP was held for several successive years as a part of the NAACL-HLT conference (Klebanov, Shutova, & Lichtenstein, 2014, 2016; Shutova, Klebanov, & Lichtenstein, 2015; Shutova, Klebanov, Tetreault, & Kozareva, 2013). The first competition of NLP systems in a shared metaphor detection task was held in 2018 (Leong, Klebanov, & Shutova, 2018). A comprehensive overview of state-of-the-art approaches to automated metaphor identification is available in (Veale, Shutova, & Klebanov, 2016).

Metaphor identification systems and metaphor-annotated datasets can be divided into two major groups: those that operate within the theoretical paradigm of conceptual metaphor (CM) (Lakoff & Johnson, 1980a), on the one hand, and, on the other hand, those that make no a priori assumptions about the underlying conceptual mechanisms of metaphor and focus on linguistic metaphor (LM).

A conceptual metaphor "consists of two conceptual domains, where one domain is understood in terms of another" (Kovecses, 2010). The domain which lends its conceptual structure to another domain is referred to as source domain; the domain which is conceptualized in terms of the source domain is called target domain. There is "a set of systematic correspondences between the source and the target in the sense that constituent conceptual elements of [the target] correspond to constituent elements of [the source]. Technically, these conceptual correspondences are often referred to as mappings" (ibid). Among the projects for conceptual metaphor identification are: (Dodge, Hong, & Stickles, 2015; Gandy et al., 2013; Gedigian, Bryant, Narayanan, & Ciric, 2006; Heintz, Gabbard, Srivastava, et al., 2013; Mohler, Bracewell, Hinote, & Tomlinson, 2013; Mohler, Rink, Bracewell, & Tomlinson, 2014; Ovchinnikova et al., 2014; Rosen, 2018; Shutova & Sun, 2013; Shutova, Sun, Gutiérrez, Lichtenstein, & Narayanan, 2016; Stowe & Palmer, 2018; Strzalkowski et al., 2013).

A linguistic metaphor is "a stretch of language that creates the possibility of activating two distinct domains" (Cameron, 2003). Systems designed within the LM paradigm aim to identify any stretches of text that contain indirectly used lexical units: (Badryzlova & Panicheva, 2018; Hovy, Srivastava, et al., 2013; Klebanov, Leong, Heilman, & Flor, 2014; Klebanov, Leong, Gutierrez, Shutova, & Flor, 2016; Krishnakumaran & Zhu, 2007; Neuman et al., 2013; Panicheva & Badryzlova, 2017; Shutova, Kiela, & Maillard, 2016; Tsvetkov, Boytsov, Gershman, Nyberg, & Dyer, 2014; Tsvetkov, Mukomel, & Gershman, 2013; Turney, Neuman, Assaf, & Cohen, 2011).

The first competition of metaphor identification systems, the VUA metaphor identification shared task (Leong et al., 2018), was held in 2018; the participating systems competed in the identification of linguistic metaphor.

Besides the differences in the paradigm (CM vs. LM), experiments in computational identification of metaphor are differentiated by the settings in which they are designed - supervised, unsupervised, or deep learning.

Computational systems for metaphor identification also differ in the types of features exploited in them:

- Lexical features (e.g. Klebanov, Leong, et al., 2014);

- Morphological and syntactic features (e.g. Hovy, Srivastava, et al., 2013; Ovchinnikova et al., 2014);

- Distributional semantic features (e.g. Shutova, Kiela, et al., 2016);

- Topic modelling (e.g. Heintz, Gabbard, Srivastava, et al., 2013);

- Features from lexical thesauri and ontologies: WordNet (e.g. Gandy et al., 2013), FrameNet (e.g. Gedigian et al., 2006), VerbNet (e.g. Klebanov, Leong, et al., 2016), ConceptNet (Ovchinnikova et al., 2014), and the SUMO ontology (J. Dunn, 2013a, 2013b);

- Psycholinguistic features: concreteness / abstractness, imageability, affect, and force (e.g. Neuman et al., 2013; Strzalkowski et al., 2013; Turney et al., 2011).

Two more characteristics by which metaphor identification systems are differentiated are the type of analysis (binary classification or sequential labelling) and the unit of analysis (e.g. pairs or triples of syntactically related words, or all content words in a sentence).

As the majority of state-of-the-art metaphor detection systems operate within the supervised setting, the role of annotated datasets for their training and testing becomes paramount. Just like metaphor identification systems, annotated datasets of metaphor can follow either of two paradigms: the conceptual metaphor approach or the linguistic metaphor approach. The largest repositories of conceptual metaphor are MetaNet (Dodge et al., 2015) and the LCC Metaphor Dataset (Mohler, Brunson, Rink, & Tomlinson, 2016), both of which are multilingual. By far the largest corpus of linguistic metaphor is the VU Amsterdam Metaphor Corpus (G. J. Steen et al., 2010), which is available for English.

The central goal of this thesis is to provide in-depth linguistic analysis of context features which can be utilized in order to automatically differentiate utterances which contain linguistic metaphor from non-metaphoric ones. We address this goal by running several machine learning experiments for metaphor identification in Russian and by evaluating the importance of each of the proposed features; the latter evaluation is also performed using machine learning algorithms. It should be emphasized that the present work does not aim to engineer an algorithm which would maximize the performance on the metaphor identification task; rather, we intend to suggest feature extraction methods and to assess the efficiency of the extracted features.

The main goal of the thesis is accomplished via the following series of tasks:

- to develop a customized scheme for annotation of linguistic metaphor at the sentence level;

- to collect and annotate a corpus of contexts containing linguistic metaphor, as well as non-metaphoric ones;

- to evaluate the quality of metaphor annotation;

- to suggest methods of feature engineering for identification of linguistic metaphor at the sentence level;

- to implement machine learning experiments for linguistic metaphor identification using models based on the suggested features and their combination;

- to evaluate the performance of the models and their generalizability for experiments on new datasets;

- to provide an in-depth linguistic analysis of the contextual factors that promote the success or failure of features.

The types of features to be explored in this research are:

1. Semantic similarity;

2. Lexical co-occurrence;

3. Morphosyntactic co-occurrence;

4. Concreteness indexes;

5. Occurrences of flag words (lexical signals of metaphoricity) and quotation marks.

The following methods and algorithms are used in the present research:

- a customized version of the MIPVU procedure for annotation of linguistic metaphor (G. J. Steen et al., 2010);

- distributional semantic models (Baroni, Dinu, & Kruszewski, 2014; Kutuzov & Kuzmenko, 2016);

- ΔP metric, a statistical measure of association (Ellis, 2006);

- Support Vector Machine algorithm;

- Random Forest algorithm;

- Logistic Regression algorithm;

- K-means clustering algorithm;

- Boruta algorithm (Kursa, Jankowski, & Rudnicki, 2010).

Relevance of the thesis. The bulk of the effort on metaphor annotation and computational metaphor identification has focused on English. Most of the metaphor annotation projects for Russian that are known to us adhere to the conceptual metaphor paradigm, such as the Russian sections of the multilingual resources (Dodge et al., 2015; Mohler et al., 2016). The only Russian dataset of linguistic metaphor known to us is the corpus compiled by Tsvetkov et al. (2014). However, this dataset differs in several regards from the corpus which was collected and annotated in the present study:

- the corpus by Tsvetkov and colleagues is smaller: its size amounts to a total of 240 sentences, while our corpus comprises more than 7,000 sentences; to the best of our knowledge, this is the largest currently existing Russian corpus annotated for linguistic metaphor;

- the corpus by Tsvetkov and coauthors does not concentrate on any specific set of target lexemes and covers a range of most frequent Russian verbs and adjectives; our corpus is designed around twenty target verbs: this allows us to explore the impact of the linguistic characteristics of verbs on the performance of classification features;

- Tsvetkov et al. report that metaphoric sentences in their corpus were selected so as to contain only one metaphor, that is, the metaphoric occurrence of the target verb or adjective; the corpus presented in this thesis was compiled with the aim of approximating the experimental task to the demands of real-world NLP applications: therefore, it contains sentences which may feature multiple instances of figurative language as well as language errors and inaccuracies.

Next, most of the computational metaphor identification work for Russian that we are aware of also follows the top-down design, i.e. is aimed at identifying conceptual metaphors (Dodge et al., 2015; J. Dunn et al., 2014; Mohler et al., 2014; Strzalkowski et al., 2013). There are two experiments for identification of linguistic metaphor in Russian that are known to us: (1) Tsvetkov et al. (2013) and (2) Tsvetkov et al. (2014). However, the design of their experiments is substantially different from the experiments conducted in this thesis:

- the experiments by Tsvetkov and coauthors are based on cross-lingual model transfer: classification features in non-English languages are translated into English with an electronic bilingual dictionary and are then vectorized using English lexical resources (such as WordNet, the MRC Psycholinguistic Database, or distributional semantic models); our experiments are monolingual: they neither depend on the quality of bilingual translation, which may become problematic in cases of polysemy, nor require data from other languages, and they rely solely on resources that are currently available for Russian NLP;

- the experiments by Tsvetkov and colleagues operate on syntactically related pairs (Adjective-Noun) and triples (Subject-Verb-Object): as a consequence, they are dependent on the quality of syntactic parsing which is not always reliable in real-life tasks; our experiments take full sentential context as input: this enables us to explore the impact of contextual and discourse factors on identification of metaphor.

Scientific novelty of the thesis. As pointed out above, we see the main goal of this thesis in suggesting a linguistic explanation and interpretation of the language and discourse-based factors which promote the success of some computational models of linguistic metaphor identification and cause other models to falter on the task. The output of a machine learning classifier is analyzed by means of statistical methods and other ML algorithms in order to arrive at empirical, data-driven conclusions about the linguistic mechanisms contributing to metaphor identification. To the best of our knowledge, this is the first attempt at such research.

Theoretical significance of the thesis. The findings of the present research may be of value for psycholinguistic and broader cognitive studies. The results presented in the thesis can shed light on the cognitive factors that make processing of metaphor by humans possible, since we explore the lexico-semantic and morphosyntactic cues which are deployed in carrying the signals of metaphoricity across from the speaker to the recipient. The results of this research can help to outline the inventory of metaphor cues and to evaluate their salience. Since we look at metaphor in context and apply a nonlinear (bag-of-items) representation, we are able to draw conclusions as to whether metaphor can be modelled as a holistic mental process in which the information carried by a verbalized message is a non-compositional unity of its constituent cues. Eventually, the present research may have implications for efforts aimed at providing a computational model of the metaphor decoding and encoding process.

Practical significance of the thesis. The major contributions of this thesis can be summarized as follows:

- The research re-implemented the approaches to corpus annotation which had been suggested in earlier work on metaphor annotation in English. We introduced minor modifications and applied the previously suggested protocols to Russian data.

- We compiled a relatively large dataset of metaphorical and non-metaphorical usages of 20 Russian verbs, which is made available for public use. To the best of our knowledge, this is the first public resource of this kind.

- An annotation validation experiment in a setting with multiple annotators was conducted.

- We release a ranking of concreteness indexes for approximately 17K Russian words.

- The study tested a number of earlier methodologies of feature extraction for metaphor identification in application to Russian (lexical and morphological frequencies, distributional semantic vectors, and concreteness scores).

- We developed a classifier for sentence-level binary-class identification of metaphoric occurrences in raw running Russian text.

- The thesis provides linguistic evaluation of the quality of classification and compares the efficiency of models based on different features.

- We also suggest a data-driven linguistic interpretation of the performance of the features and identify the features which hold potential for generalizability.

- The thesis provides analysis aimed at an empirical verification of the theoretical claims that formed the basis of the computational models.

Public demonstrations of the results. The major results of the research were presented at the following events:

• The 2017 Spring Symposium Series of the Association for the Advancement of Artificial Intelligence (Stanford University, Computer Science Department; Palo Alto, USA, 2017);

• The 2nd Kolmogorov Seminar on Computational Linguistics and Language Studies (National Research University Higher School of Economics, Moscow, Russia, 2017);

• Dialogue-2017, the 23rd International Conference on Computational Linguistics and Intellectual Technologies (Russian State University for the Humanities, Moscow, Russia, 2017);

• RuSSIR-2017, Russian Summer School in Information Retrieval (Ural State University, Yekaterinburg, Russia, 2017);

• The 3rd Kolmogorov Seminar on Computational Linguistics and Language Studies (National Research University Higher School of Economics, Moscow, Russia, 2018);

• AINL-2018, Artificial Intelligence and Natural Language Conference (ITMO University, Saint Petersburg, 2018);

• The 9th International Cognitive Linguistics Congress (National Research University Higher School of Economics, Nizhny Novgorod, 2019).

Note on collaboration

The initial experiments on Russian verbal metaphor identification with distributional semantic features (Panicheva & Badryzlova, 2017b) were led by Polina Panicheva in collaboration with the author of the thesis. All the other theoretical, experimental and composition work involved in the production of the thesis was carried out by the author alone.

Organization of the thesis. The thesis consists of an Introduction, four Chapters, a Summary, and a List of References comprising 206 titles.

Chapter I provides an overview of the state-of-the-art approaches to annotation of metaphor in corpora and to engineering computational systems for automated metaphor identification.

Chapter II is devoted to the experimental corpus: the principles of selecting the data and the target verbs, and of annotating the corpus. The chapter also gives an outline of the metaphoric and non-metaphoric classes and describes the inter-annotator reliability test: the annotator instructions, the annotation binarization schemes, and the obtained measure of agreement between the annotators. The last subsection of the chapter looks at the cases of inter-annotator disagreement.

Chapter III details the metaphor identification experiment. It introduces the set of chosen features and explains the theoretical background which motivated the choice. The chapter goes on to describe the statistical approaches and computational resources which were applied in order to convert the input data into vectors, as well as the design of the machine learning experiment. The second half of the chapter discusses the results of the classification experiment: we compare the performance of models and evaluate the utility of increasing the model complexity.

Chapter IV presents an in-depth analysis of the linguistic factors determining the performance of the models. We identify the linguistic units which are most likely to carry the signal of metaphoricity and make predictions about their generalizability.

Finally, we present the Conclusions of the thesis and make suggestions for future research in the area of computational identification of metaphor.


Conclusion of the dissertation on the topic "Applied and mathematical linguistics", Badryzlova Yulia Gennadyevna

Thesis summary

We have presented an attempt to conduct statistical modelling of metaphoric occurrences of Russian verbs in context. The analysis of the obtained models enabled us to make non-trivial observations about the conceptual nature and the linguistic structure of metaphor and metaphoric discourse.

The contributions of the research can be summarized as follows:

We release a new lexical resource - an annotated Russian corpus of linguistic metaphor. The corpus contains approximately 7,000 occurrences of verbal metaphor in context; to the best of our knowledge, this is the largest currently existing Russian corpus of this kind. The corpus is built around 20 polysemous target verbs: each sentence contains an occurrence of the target verb; the sentence is tagged as metaphoric when the target verb is used metaphorically, and as non-metaphoric when the target verb is used in its non-metaphoric sense. This approach is similar to the design of the TroFi (Trope Finder) dataset by Birke and Sarkar (2006), which was created for English. The sentential contexts in our corpus are represented by unedited free text which was not controlled for conceptual complexity, i.e. each sentence can contain multiple occurrences of metaphor and metonymy outside of the target verb, which allows us to look at the natural behavior of the target metaphor in discourse and to explore the role of the context. In this regard our corpus differs from the dataset of Tsvetkov et al. (2014), in which sentences were selected so that each of them contained only one metaphorical occurrence located in the target verb. The metaphoric class in our corpus contains three types of metaphor: conventionalized usages, creative usages, and idiomatic expressions with the target verb. The annotation reliability test with three annotators yielded a high degree of inter-annotator agreement (0.83 and 0.9, under various conditions).

We suggest a statistical approach to quantifying the degree of association between metaphor and the constituent elements of the discourse - the index of metaphor association, which is based on the ΔP metric (Ellis, 2006). In our experiments, the index of metaphor association is computed for lexical (lemma) unigrams and full morphosyntactic tags.
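The association index described above can be illustrated with a minimal sketch. It assumes that the metric cited from Ellis (2006) is the ΔP contingency measure, ΔP = P(outcome | cue) - P(outcome | no cue), computed from a 2x2 table of sentence counts; the function name, the argument layout, and the counts are hypothetical and are not taken from the thesis.

```python
def delta_p(a: int, b: int, c: int, d: int) -> float:
    """Delta-P contingency: P(metaphoric | item) - P(metaphoric | no item).

    a: metaphoric sentences containing the item
    b: non-metaphoric sentences containing the item
    c: metaphoric sentences without the item
    d: non-metaphoric sentences without the item
    """
    if a + b == 0 or c + d == 0:
        raise ValueError("the item must occur in some sentences and be absent from others")
    return a / (a + b) - c / (c + d)


# Hypothetical counts for a single lemma (illustrative only, not corpus data).
score = delta_p(a=30, b=10, c=20, d=40)
print(round(score, 3))
```

Under this reading, positive values would indicate items attracted to metaphoric contexts and negative values items attracted to non-metaphoric ones.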

We provide a method for computing concreteness indexes of lexemes. The computation is based on a seed set of approximately 500 'thingness' paradigm words; for each word of the corpus, we use a pre-trained distributional semantic model to find its ten nearest neighbors among the paradigm words, and take the average semantic distance as the concreteness index. The concreteness ranking of about 17,000 Russian lexemes is made publicly available.
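The nearest-neighbor computation can be sketched as follows. This is an illustration under stated assumptions rather than the procedure used in the thesis: cosine similarity stands in for the unspecified semantic distance, toy two-dimensional vectors stand in for a pre-trained distributional model, and the names `cosine` and `concreteness_index` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def concreteness_index(word_vec, paradigm_vecs, k=10):
    """Mean cosine similarity between a word and its k nearest paradigm words."""
    if not paradigm_vecs:
        raise ValueError("the paradigm set must not be empty")
    sims = sorted((cosine(word_vec, p) for p in paradigm_vecs), reverse=True)
    top = sims[:k]
    return sum(top) / len(top)


# Toy vectors (illustrative only): the first word lies close to the paradigm set.
paradigm = [(1.0, 0.0), (0.9, 0.1), (0.8, 0.2)]
concrete_like = concreteness_index((1.0, 0.05), paradigm, k=2)
abstract_like = concreteness_index((0.0, 1.0), paradigm, k=2)
print(concrete_like > abstract_like)
```

With a real model, `paradigm_vecs` would hold the vectors of the seed words and `k` would be set to ten, as in the description above.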

Theoretical literature on metaphor studies (e.g. Goatly, 1997) has placed substantial emphasis on the role of special lexical markers of metaphoricity (known as 'flag words') in creating and indicating metaphoricity. We show that flag words (such as буквально 'literally', как будто, словно 'as if', т.е. 'i.e.', and подобно 'like') as well as quotation marks cannot be efficiently used as features for classification due to sparse data.
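A sentence-level extractor for these two features might look like the sketch below. The flag-word list is limited to the examples quoted above, the set of quotation characters is an assumption, and the matching is naive substring counting (it would also fire inside longer words), so the code only illustrates the shape of the feature, not the thesis implementation.

```python
# Flag words quoted in the text; the full list used in the thesis may differ.
FLAG_WORDS = ("буквально", "как будто", "словно", "т.е.", "подобно")
# Assumed set of quotation characters.
QUOTE_CHARS = "«»\"“”„"


def flag_features(sentence: str) -> dict:
    """Count flag-word occurrences and quotation marks in one sentence."""
    lowered = sentence.lower()
    return {
        # Naive substring counts: no tokenization or lemmatization is applied.
        "flag_words": sum(lowered.count(w) for w in FLAG_WORDS),
        "quotes": sum(sentence.count(q) for q in QUOTE_CHARS),
    }


print(flag_features("Он словно «летел» над землей"))
```

Because most sentences contain neither a flag word nor a quotation mark, such a feature vector is zero almost everywhere, which is the sparsity problem noted above.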

We show that the features used in our classification experiments are sensitive to the size of the context window; in aggregate terms, the highest accuracy of classification is achieved on the context of full sentences. This observation accords with the study by Mu et al. (2019) which demonstrates that contextual features significantly enhance the quality of classification.
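The notion of a context window can be made concrete with a small helper. The sketch assumes a symmetric window measured in tokens around the target verb, with the target itself included and `None` standing for the full sentence; the thesis may define its windows differently, and the function name is hypothetical.

```python
def context_window(tokens, target_index, size=None):
    """Return the tokens within `size` positions of the target token.

    size=None returns the full sentence; the target token itself is included.
    """
    if size is None:
        return list(tokens)
    start = max(0, target_index - size)
    return list(tokens[start:target_index + size + 1])


tokens = ["a", "b", "c", "d", "e"]
print(context_window(tokens, target_index=2, size=1))
print(context_window(tokens, target_index=2))
```

Features would then be extracted from the returned token list, so that the same feature can be evaluated at several window sizes.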

We conduct several series of classification experiments with models of different complexity (uni-, bi-, tri-, and four-feature models) based on the four types of features: (1) lexical co-occurrence, (2) morphosyntactic co-occurrence, (3) distributional semantic similarity, and (4) concreteness. We evaluate the accuracy of each model on each of the 20 datasets of the individual target verbs, and on the combined dataset of the 20 target verbs. The mean accuracy across the 20 datasets yielded by the distributional semantic model is 0.67; the mean accuracy of the lexical co-occurrence model is 0.82; the mean accuracy of the morphosyntactic model is 0.74; the mean accuracy of the concreteness model is 0.74. The accuracy achieved on the combined dataset of the 20 verbs by the distributional semantic model is 0.65; by the lexical co-occurrence model - 0.82; by the morphosyntactic model - 0.67; by the concreteness model - 0.63. Combining several features in one classifier increases the accuracy of classification by 1-3 accuracy points. We claim that among the uni-feature models, the lexical co-occurrence model may hold the greatest promise for generalizability, since it achieves the highest result both on the combined dataset and across the 20 datasets, while maintaining stable performance. This result is in line with the findings of Klebanov, Leong, Heilman, and Flor (2014) who show that lexical unigrams prove very successful in linguistic metaphor classification.
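The space of uni-, bi-, tri-, and four-feature models can be enumerated mechanically. The sketch below only lists the feature combinations; the feature labels are shorthand introduced here, and training a classifier on each combination is left out.

```python
from itertools import combinations

# Shorthand labels for the four feature types named in the text.
FEATURES = ("lexical", "morphosyntactic", "distributional", "concreteness")


def feature_subsets(features=FEATURES):
    """All non-empty feature combinations, from uni- to four-feature models."""
    return [combo
            for r in range(1, len(features) + 1)
            for combo in combinations(features, r)]


models = feature_subsets()
print(len(models))  # 15 = 4 + 6 + 4 + 1
```

Each tuple would correspond to one model whose input is the concatenation of the chosen feature vectors.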

We offer empirical support to the hypothesis that the efficiency of the lexical co-occurrence classifier may be related to the lexical homogeneity of one of the subcorpora (the metaphoric or the non-metaphoric ones). We measure the index of lexical diversity in the metaphoric and the non-metaphoric parts of each of the 20 individual target verbs datasets and find a strong negative correlation between the lexical diversity of the non-metaphoric subcorpus and the accuracy of classification. Thus, datasets with more lexically homogeneous non-metaphoric subcorpora are more likely to be classified with greater accuracy.
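The correlation analysis can be sketched in a few lines. The text does not say which index of lexical diversity was used, so the type-token ratio below is an assumption, as is the use of Pearson's coefficient; the numbers in the usage example are invented purely to show a negative relationship.

```python
import math


def type_token_ratio(lemmas):
    """Share of distinct lemmas among all lemma tokens (assumed diversity index)."""
    return len(set(lemmas)) / len(lemmas) if lemmas else 0.0


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / (sd_x * sd_y)


# Invented per-verb values (not results from the thesis).
diversity = [0.2, 0.4, 0.6]
accuracy = [0.9, 0.8, 0.7]
print(round(pearson(diversity, accuracy), 3))
```

In the setting described above, `diversity` would hold one value per target verb for the non-metaphoric subcorpus and `accuracy` the corresponding classification accuracy.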

We attempt to induce the set of lexemes that are likely to serve as lexical cues of metaphoricity: we use an algorithm for evaluating the importance of lexical data points for classification and filter them by the variance of their distribution in the corpus. The resulting list contains 230 lexemes; we show that nouns are likely to bear greater importance as lexical cues of metaphoricity, since they are overrepresented in the list (as compared to the entire corpus).

We demonstrate that metaphor association of lexemes correlates with their concreteness indexes: lexemes with stronger metaphor association tend to be more abstract, while words that are associated with non-metaphoric contexts appear to be more concrete.

We outline the direction for future research in which combination of lexical metaphor association indexes and concreteness indexes can potentially be used for induction of conceptual metaphor mappings from the corpus. Sequential clustering of metaphor association and concreteness indexes yields clusters which vary in metaphor association, within which we separate abstract and concrete vocabulary. By adding statistical co-occurrence indexes it may be possible to construct a weighted directed graph in which nodes (lexemes) are connected by weighted edges (where weight is defined as the index of co-occurrence of the lexemes), and the edges are directed from lexemes with high metaphor association to lexemes with low metaphor association, and from abstract words to concrete ones.
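The graph construction proposed above can be illustrated with a small sketch. It rests on several assumptions not fixed by the text: each lexeme carries a (metaphor association, concreteness) pair, each co-occurring pair is listed once with its weight, and an edge is kept only when both ordering criteria agree on its direction. All names and values are hypothetical.

```python
def build_graph(lexemes, cooccurrence):
    """Build weighted directed edges between co-occurring lexemes.

    lexemes: {lemma: (metaphor_association, concreteness)}
    cooccurrence: {(lemma_a, lemma_b): weight}, each pair listed once
    Returns {(source, target): weight}.
    """
    edges = {}
    for (a, b), weight in cooccurrence.items():
        assoc_a, conc_a = lexemes[a]
        assoc_b, conc_b = lexemes[b]
        # Direct the edge from higher to lower metaphor association and
        # from the more abstract (less concrete) lexeme to the more concrete one.
        if assoc_a > assoc_b and conc_a < conc_b:
            edges[(a, b)] = weight
        elif assoc_b > assoc_a and conc_b < conc_a:
            edges[(b, a)] = weight
        # Pairs on which the two criteria disagree are skipped (an assumption).
    return edges


# Placeholder lexemes and scores (illustrative only).
lexemes = {"idea": (0.6, 0.1), "stone": (-0.4, 0.9)}
print(build_graph(lexemes, {("stone", "idea"): 2.5}))
```

How to treat pairs where the two criteria point in opposite directions is a design choice that the outline above leaves open.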

We offer the hypothesis that the efficiency of the classifier based on distributional semantic similarity is likely to be related to the semantic homogeneity of one or both of the subcorpora (the metaphoric and the non-metaphoric).

We show that the efficiency of the morphosyntactic classifier on the datasets of the individual target verbs correlates with the degree to which the grammatical categories with high and low metaphor association are juxtaposed to each other in the corpus. We measure this degree of juxtaposition as the variance and slope of the curve formed by the metaphor association indexes of grammatical categories.
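One way to read the juxtaposition measure is sketched below. It assumes that the curve is obtained by sorting the metaphor association indexes of the grammatical categories in descending order, that the slope is the least-squares slope over rank, and that the variance is the population variance; these choices and the sample values are illustrative assumptions.

```python
def variance_and_slope(indexes):
    """Population variance of the indexes and least-squares slope of the
    curve obtained by sorting them in descending order (x = rank)."""
    ys = sorted(indexes, reverse=True)
    n = len(ys)
    if n < 2:
        raise ValueError("at least two indexes are required")
    mean_y = sum(ys) / n
    variance = sum((y - mean_y) ** 2 for y in ys) / n
    mean_x = (n - 1) / 2
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(ys))
    return variance, sxy / sxx


# Invented association indexes for three grammatical categories.
variance, slope = variance_and_slope([0.3, -0.3, 0.0])
print(round(variance, 3), round(slope, 3))
```

Under this reading, a steeper slope and a larger variance would indicate a sharper contrast between metaphor-associated and non-metaphor-associated grammatical categories for a given verb.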

We show that the classifier operating on morphosyntactic features relies on idiosyncratic patterns of morphosyntactic combinability licensed by individual target verbs; therefore, the morphosyntactic feature does not generalize well across the datasets.

List of references of the dissertation research, Candidate of Sciences Badryzlova Yulia Gennadyevna, 2019

The following papers have been published on the topic of the present thesis; three papers are devoted to computational metaphor identification and two to annotation of metaphor in corpus:

1. Badryzlova, Y. (2017). Opy't korpusnogo modelirovaniya faktorov metaforichnosti na primere russkix glagolov [A corpus-based study of factors and models of metaphoricity: evidence from Russian verbs]. Computational Linguistics and Intellectual Technologies, 2, 30-44. Moscow.

2. Badryzlova, Y., & Lyashevskaya, O. (2017). Metaphor Shifts in Constructions: The Russian Metaphor Corpus. The 2017 AAAI Spring Symposium Series: Technical Reports, 127-130.

3. Badryzlova, Y., Lyashevskaya, O., & Panicheva, P. (2019). Computer and metaphor: when lexicon, morphology, punctuation, and other beasts fail to predict sentence metaphoricity. Cognitive Studies of Language. Integrative Processes in Cognitive Linguistics, 37, 609-615. Nizhny Novgorod.

4. Badryzlova, Y., & Panicheva, P. (2018). A Multi-feature Classifier for Verbal Metaphor Identification in Russian Texts. Conference on Artificial Intelligence and Natural Language, 23-34. Springer.

5. Panicheva, P., & Badryzlova, Y. (2017). Distributional semantic features in Russian verbal metaphor identification. Computational Linguistics and Intellectual Technologies, 1, 179-190. Moscow.

List of References

Ahmad, K., Gillam, L., & Tostevin, L. (2000). Weirdness Indexing for Logical Document Extrapolation and Retrieval (WILDER). Proceedings of the Eighth Text Retrieval Conference (TREC-8), 1-8.

Allan, L. G. (1980). A note on measurement of contingency between two binary variables in judgment tasks. Bulletin of the Psychonomic Society, 15(3), 147-149.

Apresyan, V., Babaeva, E., Boguslavskaya, O., Galaktionova, I., Glovinskaya, M., Krylova, T., ... Uryson, E. (2014). The Active Dictionary of the Russian Language (1st ed., Vol. 2; Y. Apresjan, Ed.). Moscow: LRC Publishing House.

Apresyan, Y. (1995). Leksicheskaya semantika. Sinonimicheskiye sredstva yazyka [Lexical semantics. The synonymic means of the language] (2nd ed.). Moscow: Yazy'ki slavyanskoj kul'tury' (Languages of Russian Culture Publishing House).

Apresyan, Y. (2009). Issledovanija po semantike i leksikografii: Paradigmatika. [The study of semantics and lexicography: The paradigmatic aspect.] (Vol. 1). Moscow: Yazy'ki slavyanskoj kul'tury' (Languages of Russian Culture Publishing House).

Badryzlova, Y., & Panicheva, P. (2018). A Multi-feature Classifier for Verbal Metaphor Identification in Russian Texts. Conference on Artificial Intelligence and Natural Language, 23-34. Springer.

Baroni, M., Dinu, G., & Kruszewski, G. (2014). Don't count, predict! A systematic comparison of context-counting vs. context-predicting semantic vectors. Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 1, 238-247.

Barsalou, L. W. (2008). Grounded cognition. Annu. Rev. Psychol., 59, 617-645.

Barsalou, L. W. (2010). Grounded cognition: Past, present, and future. Topics in Cognitive Science, 2(4), 716-724.

Benko, V., & Zakharov, V. (2016). Very large Russian corpora: New opportunities and new challenges. In Computational linguistics and intellectual technologies. Russian State University for the Humanities.

Birke, J., & Sarkar, A. (2006). A clustering approach for nearly unsupervised recognition of nonliteral language. 11th Conference of the European Chapter of the Association for Computational Linguistics.

Blanchard, D., Tetreault, J., Higgins, D., Cahill, A., & Chodorow, M. (2013). TOEFL11: A corpus of non-native English. ETS Research Report Series, 2013(2), i-15.

Boguslavsky, I. (2014). SynTagRus-a Deeply Annotated Corpus of Russian. Les Émotions Dans Le Discours-Emotions in Discourse, 367-380.

Bracewell, D. B., Tomlinson, M. T., Mohler, M., & Rink, B. (2014). A tiered approach to the recognition of metaphor. International Conference on Intelligent Text Processing and Computational Linguistics, 403-414. Springer.

Broadwell, G. A., Boz, U., Cases, I., Strzalkowski, T., Feldman, L., Taylor, S., ... Webb, N. (2013). Using imageability and topic chaining to locate metaphors in linguistic corpora. International Conference on Social Computing, Behavioral-Cultural Modeling, and Prediction, 102-110. Springer.

Brugman, C. (1988). The syntax and semantics of HAVE and its complements.

Brysbaert, M., Warriner, A. B., & Kuperman, V. (2014). Concreteness ratings for 40 thousand generally known English word lemmas. Behavior Research Methods, 46(3), 904-911.

Bulat, L., Clark, S., & Shutova, E. (2017). Modelling metaphor with attribute-based semantics. Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 2, Short Papers, 2, 523-528.

Calvo, P., & Gomila, T. (2008). Handbook of cognitive science: An embodied approach. Elsevier.

Cameron, L. (2003). Metaphor in educational discourse.

Charteris-Black, J. (2004). Corpus approaches to critical metaphor analysis. Retrieved from http://eprints.uwe.ac.uk/6739/

Chilton, P. A. (1996). Security metaphors: Cold war discourse from containment to common house (Vol. 2). Lang New York.

Clark, S. (2015). Vector space models of lexical meaning. Handbook of Contemporary Semantics, 10, 9781118882139.

Coltheart, M. (1981a). The MRC psycholinguistic database. The Quarterly Journal of Experimental Psychology Section A, 33(4), 497-505.

Coltheart, M. (1981b). The MRC psycholinguistic database. The Quarterly Journal of Experimental Psychology, 33(4), 497-505.

Cressie, N., & Read, T. R. (1984). Multinomial goodness-of-fit tests. Journal of the Royal Statistical Society. Series B (Methodological), 440-464.

Crisp, P. (2002). Metaphorical propositions: A rationale. Language and Literature, 11(1), 7-16.

Deerwester, S., Dumais, S. T., Furnas, G. W., Landauer, T. K., & Harshman, R. (1990). Indexing by latent semantic analysis. Journal of the American Society for Information Science, 41(6), 391-407.

Dodge, E., Hong, J., & Stickles, E. (2015). MetaNet: Deep semantic automatic metaphor analysis. NAACL HLT2015, 40.

Droganova, K., & Medyankin, N. (2016). NLP pipeline for Russian: An easy-to-use web application for morphological and syntactic annotation. Proceedings of the Annual International Conference "Dialogue". Presented at the Annual International Conference "Dialogue", Moscow.

Dunn, J. (2013a). Evaluating the premises and results of four metaphor identification systems. International Conference on Intelligent Text Processing and Computational Linguistics, 471-486. Retrieved from http://link.springer.com/chapter/10.1007/978-3-642-37247-6_38

Dunn, J. (2013b). What metaphor identification systems can tell us about metaphor-in-language. Proceedings of the First Workshop on Metaphor in NLP, 1-10. Retrieved from https://www.aclweb.org/anthology/W/W13/W13-09.pdf#page=11

Dunn, J., de Heredia, J. B., Burke, M., Gandy, L., Kanareykin, S., Kapah, O., ... Grossman, D. (2014). Language-Independent Ensemble Approaches to Metaphor Identification. Workshops at the Twenty-Eighth AAAI Conference on Artificial Intelligence.

Dunn, J. E. (2013). Automatic identification of metaphoric utterances. Purdue University.

Ellis, N. C. (2006). Language acquisition as rational contingency learning. Applied Linguistics, 27(1), 1-24.

Ellis, N. C., & Ferreira-Junior, F. (2009). Constructions and their acquisition: Islands and the distinctiveness of their occupancy. Annual Review of Cognitive Linguistics, 7(1), 188-221.

Evert, S. (2005). The statistics of word cooccurrences: Word pairs and collocations.

Fass, D. (1991). met*: A method for discriminating metonymy and metaphor by computer. Computational Linguistics, 17(1), 49-90.

Fauconnier, G., & Turner, M. (2002). The way we think: Conceptual blending and the mind's hidden complexities. Basic Books.

Fellbaum, C. (1998). WordNet: An electronic database. MIT Press, Cambridge, MA.

Fenogenova, A., Kayutenko, D., & Dereza, O. (2015). Mystem+. Retrieved from http://web-corpora.net/wsgi/mystemplus.wsgi/mystemplus/

Fillmore, C. J. (1985). Syntactic intrusions and the notion of grammatical construction. Annual Meeting of the Berkeley Linguistics Society, 11, 73-86.

Fillmore, C. J. (1988). The mechanisms of "construction grammar". Annual Meeting of the Berkeley Linguistics Society, 14, 35-55.

Fillmore, C. J., Kay, P., & O'Connor, M. C. (1988). Regularity and idiomaticity in grammatical constructions: The case of let alone. Language, 501-538.

Firth, J. R. (1957). A synopsis of linguistic theory, 1930-1955. Studies in Linguistic Analysis.

Flusberg, S. J., Matlock, T., & Thibodeau, P. H. (2017). Metaphors for the war (or race) against climate change. Environmental Communication, 11(6), 769-783.

Gallagher, L., McAuley, J., & Moseley, G. L. (2013). A randomized-controlled trial of using a book of metaphors to reconceptualize pain and decrease catastrophizing in people with chronic pain. The Clinical Journal of Pain, 29(1), 20-25.

Gandy, L., Allan, N., Atallah, M., Frieder, O., Howard, N., Kanareykin, S., ... Argamon, S. (2013). Automatic Identification of Conceptual Metaphors With Limited Knowledge. AAAI. Retrieved from https://cps-xena.cps.cmich.edu/lgandy/automatic_metaphor.pdf

Gedigian, M., Bryant, J., Narayanan, S., & Ciric, B. (2006). Catching metaphors. Proceedings of the Third Workshop on Scalable Natural Language Understanding, 41-48. Association for Computational Linguistics.

Gentner, D., & Bowdle, B. F. (2001). Convention, form, and figurative language processing. Metaphor and Symbol, 16(3-4), 223-247.

Gibbs Jr, R. W. (2006). Introspection and cognitive linguistics: Should we trust our own intuitions? Annual Review of Cognitive Linguistics, 4(1), 135-151.

Gibbs, R. W. (1994). The poetics of mind: Figurative thought, language, and understanding. Cambridge University Press.

Goatly, A. (1997). The language of metaphors (Vol. 37).

Goldberg, A. E. (1995). Constructions: A construction grammar approach to argument structure. University of Chicago Press.

Goossens, L. (2002). Metaphtonymy: The interaction of metaphor and metonymy in expressions for linguistic. Metaphor and Metonymy in Comparison and Contrast, 20, 349.

Grady, J. (1997). Foundations of meaning: Primary metaphors and primary scenes.

Graff, D., Kong, J., Chen, K., & Maeda, K. (2003). English gigaword. Linguistic Data Consortium, Philadelphia, 4(1), 34.

Gurin, G., & Belikova, A. (2012). Metodika ocenki konvencional'nosti metaforicheskix vy'razhenij: Ot intuitivistskix kriteriev k operacional'ny'm [A procedure for evaluating degree of conventionality of metaphor expressions: From intuition to operational criteria]. Proceedings of the Annual International Conference "Dialogue", 1, 187-197. Moscow, Russia.

Halliday, M. A. K., & Hasan, R. (1976). Cohesion in English (1st ed.). Longman.

Hamp, B., & Feldweg, H. (1997). GermaNet - a lexical-semantic net for German. Automatic Information Extraction and Building of Lexical Semantic Resources for NLP Applications.

Harris, Z. S. (1954). Distributional structure. Word, 10(2-3), 146-162.

Haser, V. (2005). Metaphor, metonymy, and experientialist philosophy: Challenging cognitive semantics (Vol. 49). Walter de Gruyter.

Hauser, D. J., & Schwarz, N. (2015). The war on prevention: Bellicose cancer metaphors hurt (some) prevention intentions. Personality and Social Psychology Bulletin, 41(1), 66-77.

Heintz, I., Gabbard, R., Srinivasan, M., Barner, D., Black, D. S., Freedman, M., & Weischedel, R. (2013). Automatic extraction of linguistic metaphor with lda topic modeling. Proceedings of the First Workshop on Metaphor in NLP, 58-66. Retrieved from http://citeseerx.ist.psu.edu/viewdoc/download?doi= #page=68

Heintz, I., Gabbard, R., Srivastava, M., Barner, D., Black, D., Friedman, M., & Weischedel, R. (2013). Automatic extraction of linguistic metaphors with lda topic modeling. Proceedings of the First Workshop on Metaphor in NLP, 58-66.

Hendricks, R. K., & Boroditsky, L. (2016). Emotional implications of metaphor: Consequences of metaphor framing for mindset about hardship. Proceedings of the 38th Annual Conference of the Cognitive Science Society, 1164-1169.

Herbelot, A., & Kochmar, E. (2016). 'Calling on the classical phone': A distributional model of adjective-noun errors in learners' English. Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, 976-986.

Hovy, D., Berg-Kirkpatrick, T., Vaswani, A., & Hovy, E. (2013). Learning whom to trust with MACE. Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 1120-1130.

Hovy, D., Srivastava, S., Jauhar, S. K., Sachan, M., Goyal, K., Li, H., ... Hovy, E. (2013). Identifying metaphorical word use with tree kernels. Proceedings of the First Workshop on Metaphor in NLP, 52-57. Retrieved from http://citeseerx.ist.psu.edu/viewdoc/download?doi= age=62

Huang, E. H., Socher, R., Manning, C. D., & Ng, A. Y. (2012). Improving word representations via global context and multiple word prototypes. Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers-Volume 1, 873-882. Association for Computational Linguistics.

Jackendoff, R., & Aaron, D. (1991). More than cool reason: A field guide to poetic metaphor by George Lakoff and Mark Turner. Language, 67(2), 320-338.

Jia, L., & Smith, E. R. (2013). Distance makes the metaphor grow stronger: A psychological distance model of metaphor use. Journal of Experimental Social Psychology, 49(3), 492-497.

Jia, Y., Shelhamer, E., Donahue, J., Karayev, S., Long, J., Girshick, R., ... Darrell, T. (2014). Caffe: Convolutional architecture for fast feature embedding. Proceedings of the 22nd ACM International Conference on Multimedia, 675-678. ACM.

Johnson, C. (1996). Learnability in the acquisition of multiple senses: SOURCE reconsidered. Annual Meeting of the Berkeley Linguistics Society, 22, 469-480.

Johnson, C. (1999). Metaphor vs. Conflation in the Acquisition of Polysemy: The Case of SEE. Cultural, Psychological and Typological Issues in Cognitive Linguistics: Selected Papers of the Bi-Annual ICLA Meeting in Albuquerque, July 1995, 152, 155. John Benjamins Publishing.

Katz, J. J., & Fodor, J. A. (1963). The structure of a semantic theory. Language, 39(2), 170-210.

Kilgarriff, A., Baisa, V., Busta, J., Jakubicek, M., Kovar, V., Michelfeit, J., ... Suchomel, V. (2014). The Sketch Engine: Ten years on. Lexicography, 1(1), 7-36.

Kipper, K., Korhonen, A., Ryant, N., & Palmer, M. (2006). Extensive classifications of English verbs. Proceedings of the 12th EURALEX International Congress, 4, 5-2.

Kiros, R., Zhu, Y., Salakhutdinov, R. R., Zemel, R., Urtasun, R., Torralba, A., & Fidler, S. (2015). Skip-thought vectors. Advances in Neural Information Processing Systems, 3294-3302.

Klebanov, B. B., & Flor, M. (2013). Argumentation-relevant metaphors in test-taker essays. Proceedings of the First Workshop on Metaphor in NLP, 11-20.

Klebanov, B. B., Leong, B., Heilman, M., & Flor, M. (2014). Different texts, same metaphors: Unigrams and beyond. Proceedings of the Second Workshop on Metaphor in NLP, 11-17.

Klebanov, B. B., Leong, C. W., Gutierrez, E. D., Shutova, E., & Flor, M. (2016). Semantic classifications for detection of verb metaphors. Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), 2, 101-106.

Klebanov, B. B., Shutova, E., & Lichtenstein, P. (2014). Proceedings of the Second Workshop on Metaphor in NLP. Proceedings of the Second Workshop on Metaphor in NLP. Presented at the Baltimore, MD. Retrieved from http://aclweb.org/anthology/W14-2300


Klebanov, B. B., Shutova, E., & Lichtenstein, P. (2016). Proceedings of the Fourth Workshop on Metaphor in NLP. Proceedings of the Fourth Workshop on Metaphor in NLP. Presented at the San Diego, California. Retrieved from http://aclweb.org/anthology/W16-1100

Koller, V. (2004). Metaphor and gender in business media discourse: A critical cognitive study. Springer.

Kovecses, Z. (1995). American friendship and the scope of metaphor. Cognitive Linguistics, 6, 315-346.

Kovecses, Z. (2010). Metaphor: A Practical Introduction, 2nd Edition (2e edition). Oxford; New York: Oxford University Press.

Krennmayr, T. (2013a). Adding transparency to the identification of cross-domain mappings in real language data. Review of Cognitive Linguistics. Published under the Auspices of the Spanish Cognitive Linguistics Association, 11(1), 163-184.

Krennmayr, T. (2013b). Top-down versus bottom-up approaches to the identification of metaphor in discourse. metaphorik.de, 24, 7-36.

Krishnakumaran, S., & Zhu, X. (2007). Hunting elusive metaphors using lexical resources. Proceedings of the Workshop on Computational Approaches to Figurative Language, 13-20. Retrieved from http://dl.acm.org/citation.cfm?id=1611531

Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). Imagenet classification with deep convolutional neural networks. Advances in Neural Information Processing Systems, 1097-1105.

Kulagin, D. (2017). Kartaslov. Retrieved from https://github.com/dkulagin/kartaslov (Original work published 2017)

Kulagin, D. (2019). Developing computationally verifiable semantic annotation of Russian nouns [Opy't sozdaniya mashinno-proveryaemoj semanticheskoj razmetki russkix sushhestvitel'ny'x]. Presented at the Annual International Conference "Dialogue", Moscow. Retrieved from http://www.dialog-21.ru/media/4866/kulagin.pdf

Kullback, S., & Leibler, R. A. (1951). On information and sufficiency. The Annals of Mathematical Statistics, 22(1), 79-86.

Kursa, M., Jankowski, A., & Rudnicki, W. (2010). Boruta-a system for feature selection. Fundamenta Informaticae, 101(4), 271-285.

Kursa, M., & Rudnicki, W. (2018). Boruta: Wrapper Algorithm for All Relevant Feature Selection (Version 6.0.0). Retrieved from https://CRAN.R-project.org/package=Boruta

Kustova, G., Lyashevskaya, O., Paducheva, E., & Rakhilina, E. (2005). Semantic annotation of lexicon in Russian National Corpus: Principles, issues, and future directions [Semanticheskaya razmetka leksiki v Nacional'nom korpuse russkogo yazy'ka: Principy', problemy', perspektivy']. In Russian National Corpus: 2003-2005. Results and future directions [Nacional'ny'j korpus russkogo yazy'ka: 2003-2005. Rezul'taty' i perspektivy'] (pp. 155-174).

Kutuzov, A., & Kuzmenko, E. (2016). WebVectors: A toolkit for building web interfaces for vector semantic models. International Conference on Analysis of Images, Social Networks and Texts, 155-161. Springer.

Lakoff, G. (1986). The meanings of literal. Metaphor and Symbol, 1(4), 291-296.

Lakoff, G. (1993). The Contemporary Theory of Metaphor. In A. Ortony (Ed.), Metaphor and Thought (2nd ed.). Cambridge: Cambridge University Press.

Lakoff, G. (2008). Women, fire, and dangerous things. University of Chicago press.

Lakoff, G., Espenson, J., & Schwartz, A. (1991). Master Metaphor List. University of California at Berkeley.

Lakoff, G., & Johnson, M. (1980a). Metaphors We Live By (2nd ed.). Chicago-London: The University of Chicago Press.

Lakoff, G., & Johnson, M. (1980b). The metaphorical structure of the human conceptual system. Cognitive Science, 4(2), 195-208.

Lakoff, G., & Johnson, M. (1999). Philosophy in the Flesh (Vol. 4). New York: Basic Books.

Lambrecht, K. (1994). Information structure and sentence form: Topic, focus, and the mental representations of discourse referents (Vol. 71). Cambridge university press.

Landau, M. J., Keefer, L. A., & Rothschild, Z. K. (2014). Epistemic motives moderate the effect of metaphoric framing on attitudes. Journal of Experimental Social Psychology, 53, 125-138.

Landau, M. J., Oyserman, D., Keefer, L. A., & Smith, G. C. (2014). The college journey and academic engagement: How metaphor use enhances identity-based motivation. Journal of Personality and Social Psychology, 106(5), 679.

Landauer, T. K., & Dumais, S. T. (1997). A solution to Plato's problem: The latent semantic analysis theory of acquisition, induction, and representation of knowledge. Psychological Review, 104(2), 211.

Lau, J. H., & Baldwin, T. (2016). An empirical evaluation of doc2vec with practical insights into document embedding generation. ArXiv Preprint ArXiv:1607.05368.

Le, Q., & Mikolov, T. (2014). Distributed representations of sentences and documents. International Conference on Machine Learning, 1188-1196.

Leech, G. (n.d.). BNC: a brief users' guide to the grammatical tagging of the British National Corpus. Retrieved 30 May 2019, from http://www.natcorp.ox.ac.uk/docs/gramtag.html

Leezenberg, M. (2001). Contexts of metaphor. Brill.

Lenci, A. (2018). Distributional models of word meaning. Annual Review of Linguistics, 4, 151-171.

Leong, C. W. B., Klebanov, B. B., & Shutova, E. (2018). A report on the 2018 VUA metaphor detection shared task. Proceedings of the Workshop on Figurative Language Processing, 56-66.

Levin, B. (1993). English verb classes and alternations: A preliminary investigation. University of Chicago press.

Levin, L. S., Mitamura, T., MacWhinney, B., Fromm, D., Carbonell, J. G., Feely, W., ... Ramirez, C. (2014). Resources for the Detection of Conventionalized Metaphors in Four Languages. LREC, 498-501.

Levshina, N. (2015). How to do linguistics with R: Data exploration and statistical analysis. John Benjamins Publishing Company.

Liu, T., Cho, K., Broadwell, G. A., Shaikh, S., Strzalkowski, T., Lien, J., ... Webb, N. (2014). Automatic Expansion of the MRC Psycholinguistic Database Imageability Ratings. LREC, 2800-2805.

Lopukhina, A., & Lopukhin, K. (2017). Word Sense Frequency Estimation for Russian: Verbs, Adjectives, and Different Dictionaries. Electronic Lexicography in the 21st Century. Proceedings of ELex 2017 Conference, 267-280. Brno.

Lopukhina, A., Lopukhin, K., & Nosyrev, G. (2018). Automated Word Sense Frequency Estimation for Russian Nouns. In Quantitative approaches to the Russian language (pp. 79-94). Routledge.

Loukachevitch, N. (2010). Thesauri for information retrieval tasks [Tezaurusy' v zadachax informacionnogo poiska]. Moscow: Moscow State University Publishing House.

Low, G. (1999). Validating metaphor research projects. Researching and Applying Metaphor, 48-65.

Lyashevskaya, O., & Sharoff, S. (2009). Chastotny'j slovar' sovremennogo russkogo yazy'ka na materialax Nacional'nogo korpusa russkogo yazy'ka / A frequency dictionary of modern Russian language on the basis of the Russian National Corpus. Azbukovnik.

Magnuson, W. (1995). English idioms: Sayings and slang. Calgary: Prairie House Books.

Manning, C., & Schütze, H. (1999). Foundations of statistical natural language processing. MIT Press.

Martin, J. H. (1990). A computational model of metaphor interpretation. Academic Press Professional, Inc.

Martin, J. H. (1994). Metabank: A knowledge-base of metaphoric language conventions. Computational Intelligence, 10(2), 134-149.

McRae, K., Cree, G. S., Seidenberg, M. S., & McNorgan, C. (2005). Semantic feature production norms for a large set of living and nonliving things. Behavior Research Methods, 37(4), 547-559.

Mehlig, H. R. (1985). Semantika predlozheniya i semantika vida v russkom yazy'ke / The semantics of the sentence and the semantics of the aspect in the Russian language. In T. V. Bulygina & A. E. Kibrik (Eds.), Novoe v zarubezhnoj lingvistike / Advances in international linguistics (pp. 227-249). Moscow: Progress.

Mikolov, T., Chen, K., Corrado, G., & Dean, J. (2013). Efficient estimation of word representations in vector space. ArXiv Preprint ArXiv:1301.3781.

Miller, G. A. (1995). WordNet: A lexical database for English. Communications of the ACM, 38(11), 39-41.

Mio, J. S., Thompson, S. C., & Givens, G. H. (1993). The commons dilemma as metaphor: Memory, influence, and implications for environmental conservation. Metaphor and Symbol, 8(1), 23-42.

Mohammad, S., Shutova, E., & Turney, P. (2016). Metaphor as a medium for emotion: An empirical study. Proceedings of the Fifth Joint Conference on Lexical and Computational Semantics, 23-33.

Mohler, M., Bracewell, D., Hinote, D., & Tomlinson, M. (2013). Semantic signatures for example-based linguistic metaphor detection. Proceedings of the First Workshop on Metaphor in NLP, 27-35. Retrieved from http://anthology.aclweb.org/W/W13/W13-09.pdf#page=37

Mohler, M., Brunson, M., Rink, B., & Tomlinson, M. T. (2016). Introducing the LCC Metaphor Datasets. LREC.

Mohler, M., Rink, B., Bracewell, D. B., & Tomlinson, M. T. (2014). A Novel Distributional Approach to Multilingual Conceptual Metaphor Recognition. COLING, 1752-1763. Retrieved from http://www.aclweb.org/anthology/C14-1165

Morkovkin, V. (1970). Ideographic dictionaries [Ideograficheskie slovari]. Moscow: Moscow State University Publishing House.

Morris, M. W., Sheldon, O. J., Ames, D. R., & Young, M. J. (2007). Metaphors and the market: Consequences and preconditions of agent and object metaphors in stock market commentary. Organizational Behavior and Human Decision Processes, 102(2), 174-192.

Mu, J., Yannakoudakis, H., & Shutova, E. (2019). Learning Outside the Box: Discourse-level Features Improve Metaphor Identification. ArXiv Preprint ArXiv:1904.02246.

Murphy, G. L. (1996). On metaphoric representation. Cognition, 60(2), 173-204.

Murphy, G. L. (1997). Reasons to doubt the present evidence for metaphoric representation. Cognition, 62(1), 99-108.

Musolff, A. (2004). Metaphor and political discourse: Analogical reasoning in debates about Europe. Houndmills, Basingstoke, Hampshire: Palgrave Macmillan.

Narayanan, S. (1997). Embodiment in language understanding: Sensory-motor representations for metaphoric reasoning about event descriptions. University of California, Berkeley: Unpublished Doctoral Dissertation.

Neuman, Y., Assaf, D., Cohen, Y., Last, M., Argamon, S., Howard, N., & Frieder, O. (2013). Metaphor Identification in Large Texts Corpora. PLOS ONE, 8(4), e62343. https://doi.org/10.1371/journal.pone.0062343

Newman, D., Lau, J. H., Grieser, K., & Baldwin, T. (2010). Automatic evaluation of topic coherence. Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics, 100-108. Association for Computational Linguistics.

Nivre, J., Hall, J., Nilsson, J., Chanev, A., Eryigit, G., Kubler, S., ... Marsi, E. (2007). MaltParser: A language-independent system for data-driven dependency parsing. Natural Language Engineering, 13(2), 95-135.

Ovchinnikova, E., Israel, R., Wertheim, S., Zaytsev, V., Montazeri, N., & Hobbs, J. (2014). Abductive inference for interpretation of metaphors. Proceedings of the Second Workshop on Metaphor in NLP, 33-41. Retrieved from https://pdfs.semanticscholar.org/64c2/9feb317f54f38a4b61aa1a4b619cd3b90018.pdf#page=43

Oxford Text Archive, B. L. (n.d.). British National Corpus, Baby edition [Text]. Retrieved 29 August 2019, from http://ota.ox.ac.uk/desc/2553

Panicheva, P., & Badryzlova, Y. (2017). Distributional semantic features in Russian verbal metaphor identification. Computational Linguistics and Intellectual Technologies, 1, 179-190.

Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., ... Dubourg, V. (2011). Scikit-learn: Machine Learning in Python. Journal of Machine Learning Research.

Pennington, J., Socher, R., & Manning, C. (2014). Glove: Global vectors for word representation. Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), 1532-1543.

Peters, M. E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., & Zettlemoyer, L. (2018). Deep contextualized word representations. ArXiv Preprint ArXiv:1802.05365.

Pezzulo, G., Barsalou, L. W., Cangelosi, A., Fischer, M. H., McRae, K., & Spivey, M. J. (2013). Computational Grounded Cognition: A new alliance between grounded cognition and computational modeling. Frontiers in Psychology, 3. https://doi.org/10.3389/fpsyg.2012.00612

Pragglejaz Group. (2007). MIP: A method for identifying metaphorically used words in discourse. Metaphor and Symbol, 22(1), 1-39.

Radden, G. (2002). How metonymic are metaphors. Metaphor and Metonymy in Comparison and Contrast, 407-434.

Radman, Z. (1997). Difficulties with diagnosing the death of a metaphor. In Metaphors: Figures of the Mind (pp. 31-39). Springer.

Randall, B., Moss, H. E., Rodd, J. M., Greer, M., & Tyler, L. K. (2004). Distinctiveness and correlation in conceptual structure: Behavioral and computational studies. Journal of Experimental Psychology: Learning, Memory, and Cognition, 30(2), 393.

Rehurek, R., & Sojka, P. (2010). Software framework for topic modelling with large corpora. In Proceedings of the LREC 2010 Workshop on New Challenges for NLP Frameworks. Citeseer.

Ritchie, D. (2003). 'ARGUMENT IS WAR' - Or is it a Game of Chess? Multiple Meanings in the Analysis of Implicit Metaphors. Metaphor and Symbol, 18(2), 125-146.

Robins, S., & Mayer, R. E. (2000). The metaphor framing effect: Metaphorical reasoning about text-based dilemmas. Discourse Processes, 30(1), 57-86.

Rosen, Z. (2018). Computationally Constructed Concepts: A Machine Learning Approach to Metaphor Interpretation Using Usage-Based Construction Grammatical Cues. Proceedings of the Workshop on Figurative Language Processing, 102-109.

Rundell, M. (Ed.). (2002). Macmillan English dictionary for advanced learners. Oxford: Macmillan publishers.

Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., ... Bernstein, M. (2015). Imagenet large scale visual recognition challenge. International Journal of Computer Vision, 115(3), 211-252.

Russian National Corpus. (n.d.). Retrieved 9 May 2019, from http://www.ruscorpora.ru/en/index.html

RusVectores: Semanticheskie modeli dlya russkogo yazy'ka [RusVectores: semantic models for the Russian language]. (n.d.). Retrieved 25 April 2019, from RusVectores website: https://rusvectores.org/ru/

Sandhaus, E. (2008). The new york times annotated corpus. Linguistic Data Consortium, Philadelphia, 6(12), e26752.

Sardinha, T. B. (2008). Metaphor probabilities in corpora. PRAGMATICS AND BEYOND NEW SERIES, 173, 127.

Scherer, A. M., Scherer, L. D., & Fagerlin, A. (2015). Getting ahead of illness: Using metaphors to influence medical decision making. Medical Decision Making, 35(1), 37-45.

Schmid, H. (1994). Probabilistic Part-of-speech Tagging Using Decision Trees. International Conference on New Methods in Language Processing. Presented at the International Conference on New Methods in Language Processing, Manchester, UK. Retrieved from http://www.cis.uni-muenchen.de/~schmid/tools/TreeTagger/data/tree-tagger1.pdf

Schuster, M., & Paliwal, K. K. (1997). Bidirectional recurrent neural networks. IEEE Transactions on Signal Processing, 45(11), 2673-2681.

Segalovich, I. (2003). A fast morphological algorithm with unknown word guessing induced by a dictionary for a web search engine. MLMTA, 273-280. Citeseer.

Semino, E., Heywood, J., & Short, M. (2004). Methodological problems in the analysis of metaphors in a corpus of conversations about cancer. Journal of Pragmatics, 36(7), 1271-1294.

Sense frequencies with Russian Active Dictionary. (n.d.). Retrieved 9 May 2019, from http://sensefreq.ruslang.ru/

Shutova, E., Kiela, D., & Maillard, J. (2016). Black holes and white rabbits: Metaphor identification with visual features. Proc. of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 160-170.

Shutova, E., Klebanov, B. B., & Lichtenstein, P. (Eds.). (2015). Proceedings of the Third Workshop on Metaphor in NLP. Retrieved from http://www.aclweb.org/anthology/W15-14

Shutova, E., Klebanov, B. B., Tetreault, J., & Kozareva, Z. (2013). Proceedings of the First Workshop on Metaphor in NLP. Proceedings of the First Workshop on Metaphor in NLP. Presented at the Atlanta, Georgia. Retrieved from http://aclweb.org/anthology/W13-0900

Shutova, E., & Sun, L. (2013). Unsupervised metaphor identification using hierarchical graph factorization clustering. Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 978-988.

Shutova, E., Sun, L., Gutiérrez, E. D., Lichtenstein, P., & Narayanan, S. (2016). Multilingual Metaphor Processing: Experiments with Semi-Supervised and Unsupervised Learning. Computational Linguistics. Retrieved from http://www.mitpressjournals.org/doi/abs/10.1162/COLI_a_00275

Shutova, E., & Teufel, S. (2010). Metaphor Corpus Annotated for Source-Target Domain Mappings. LREC, 2, 2-2. Retrieved from http://lexitron.nectec.or.th/public/LREC-2010_Malta/pdf/612_Paper.pdf

Shutova, E., Teufel, S., & Korhonen, A. (2013). Statistical metaphor processing. Computational Linguistics, 39(2), 301-353.

Siegel, S. (1956). Nonparametric statistics for the behavioral sciences. McGraw-Hill.

Sitchinava, D. (2011). Gender. Essays for the project of corpus description of Russian grammar [Rod. Materialy dlya proekta korpusnogo opisaniya russkoj grammatiki]. Moscow.

Steen, G. (1999). From linguistic to conceptual metaphor in five steps. Amsterdam Studies in the Theory and History of Linguistic Science Series 4, 57-78.

Steen, G. (2007). Finding metaphor in grammar and usage: A methodological analysis of theory and research (Vol. 10). Retrieved from

Steen, G. (2009). From linguistic form to conceptual structure in five steps: Analyzing metaphor in poetry. Cognitive Poetics, 197-226.

Steen, G., Herrmann, B., Kaal, A., Krennmayr, T., & Pasma, T. (2010). A Method for Linguistic Metaphor Identification: From MIP to MIPVU. Amsterdam; Philadelphia, PA: John Benjamins Publishing Company.

Steen, G. J., Dorst, A. G., Herrmann, J. B., Kaal, A. A., Krennmayr, T., & Pasma, T. (2010). A method for linguistic metaphor identification: From MIP to MIPVU. Amsterdam: John Benjamins.

Stemle, E. (2016a). bot.zen @ EmpiriST 2015 - A minimally-deep learning PoS-tagger (trained for German CMC and Web data). 10th Web as Corpus Workshop (WAC-X) and the EmpiriST Shared Task, 115-119.

Stemle, E. (2016b). bot.zen @ EVALITA 2016 - A minimally-deep learning PoS-tagger (trained for Italian Tweets). CLiC-it/EVALITA.

Stemle, E., & Onysko, A. (2018). Using Language Learner Data for Metaphor Detection. Proceedings of the Workshop on Figurative Language Processing, 133-138.

Stowe, K., & Palmer, M. (2018). Leveraging Syntactic Constructions for Metaphor Identification. Proceedings of the Workshop on Figurative Language Processing, 17-26.

Strzalkowski, T., Broadwell, G. A., Taylor, S., Feldman, L., Yamrom, B., Shaikh, S., ... others. (2013). Robust extraction of metaphors from novel data. Proceedings of the First Workshop on Metaphor in NLP, 67-76. Retrieved from http://www.cl.cam.ac.uk/~es407/papers/meta4NLP2013.pdf#page=77

Thibodeau, P. H., & Boroditsky, L. (2011). Metaphors We Think With: The Role of Metaphor in Reasoning. PLoS ONE, 6(2), e16782. https://doi.org/10.1371/journal.pone.0016782

Tsvetkov, Y., Boytsov, L., Gershman, A., Nyberg, E., & Dyer, C. (2014). Metaphor detection with cross-lingual model transfer. Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 1, 248-258.

Tsvetkov, Y., Mukomel, E., & Gershman, A. (2013). Cross-lingual metaphor detection using common semantic features. Proceedings of the First Workshop on Metaphor in NLP, 45-51.

Tsvetkov, Y., Schneider, N., Hovy, D., Bhatia, A., Faruqui, M., & Dyer, C. (2014). Augmenting English adjective senses with supersenses.

Turney, P. D., Neuman, Y., Assaf, D., & Cohen, Y. (2011). Literal and metaphorical sense identification through concrete and abstract context. Proceedings of the Conference on Empirical Methods in Natural Language Processing, 680-690. Retrieved from http://dl.acm.org/citation.cfm?id=2145511

Turney, P. D., & Pantel, P. (2010). From frequency to meaning: Vector space models of semantics. Journal of Artificial Intelligence Research, 37, 141-188.

Tyler, L. K., Moss, H. E., Durrant-Peatfield, M. R., & Levy, J. P. (2000). Conceptual structure and the structure of concepts: A distributed account of category-specific deficits. Brain and Language, 75(2), 195-231.

Varela, F. J., Thompson, E., & Rosch, E. (2017). The embodied mind: Cognitive science and human experience. MIT Press.

Veale, T. (2018). The "default" in our stars: Signposting non-defaultness in ironic discourse. Metaphor and Symbol, 33(3), 175-184. https://doi.org/10.1080/10926488.2018.1481262

Veale, T., Shutova, E., & Klebanov, B. B. (2016). Metaphor: A Computational Perspective. Synthesis Lectures on Human Language Technologies, 9(1), 1-160. https://doi.org/10.2200/S00694ED1V01Y201601HLT031

Vendler, Z. (1957). Verbs and times. The Philosophical Review, 66(2), 143-160.

Vervaeke, J., & Green, C. D. (1997). Women, fire, and dangerous theories: A critique of Lakoff's theory of categorization. Metaphor and Symbol, 12(1), 59-80.

Vervaeke, J., & Kennedy, J. M. (1996). Metaphors in language and thought: Falsification and multiple meanings. Metaphor and Symbol, 11(4), 273-284.

Wilks, Y. (1978). Making preferences more active. Artificial Intelligence, 11(3), 197-223.

Wilson, M. (1988). MRC Psycholinguistic Database: Machine-usable dictionary, version 2.00. Behavior Research Methods, Instruments, & Computers, 20(1), 6-10.

Wu, C., Wu, F., Chen, Y., Wu, S., Yuan, Z., & Huang, Y. (2018). Neural Metaphor Detecting with CNN-LSTM Model. Proceedings of the Workshop on Figurative Language Processing, 110-114. New Orleans, Louisiana.

Yevgenyeva, A. (Ed.). (1981). Dictionary of the Russian Language (2nd ed., Vols 1-4). Moscow: Academy of Sciences of the USSR; Russian Language Institute.

Zhu, Y., Kiros, R., Zemel, R., Salakhutdinov, R., Urtasun, R., Torralba, A., & Fidler, S. (2015). Aligning books and movies: Towards story-like visual explanations by watching movies and reading books. Proceedings of the IEEE International Conference on Computer Vision, 19-27.
