
Announcement of PhD Thesis Presentation by Elias Iosif, ECE Department

  • Posted 20-05-2013 08:12 by Eleni Stamataki

    Author email: estamataki<at>tuc.gr

    Updated: -

    Role: employee (retired/departed).
    Department of Electronic and Computer Engineering


    PHD THESIS PRESENTATION


    “Network-based Distributional Semantic Models”


    Elias Iosif



    Wednesday, 22 May 2013, 13:00
    Amphitheatre, Sciences Building, University Campus

    Examination Committee:
    Assoc. Prof. Alexandros Potamianos, ECE Department, Technical University of Crete (Supervisor)
    Prof. Euripides Petrakis, ECE Department, Technical University of Crete
    Dr. Evangelos Karkaletsis, Research Director, Institute of Informatics and Telecommunications, NCSR “Demokritos”
    Prof. Michail Zervakis, ECE Department, Technical University of Crete
    Prof. Minos Garofalakis, ECE Department, Technical University of Crete
    Assoc. Prof. Michail Lagoudakis, ECE Department, Technical University of Crete
    Assoc. Prof. Marco Baroni, CIMeC, University of Trento


    ABSTRACT

    In this thesis, the unsupervised creation of language-agnostic Distributional Semantic Models (DSMs) using web-harvested data is investigated for the problem of semantic similarity estimation. Semantic similarity can be regarded as the building block for numerous tasks of Natural Language Processing, e.g., affective text analysis and paraphrasing.

    The first part of the thesis deals with the construction of typical DSMs following the well-established Vector Space Model. More specifically, corpora are created by harvesting web documents following a query-based approach. Two families of similarity metrics are applied, and their related parameters are investigated. The similarity metrics are evaluated against human similarity ratings, achieving state-of-the-art results that are comparable with those of knowledge-based metrics.

    Despite its good performance, this methodology suffers from quadratic query complexity with respect to the size of the lexicon. A methodology of linear query complexity is proposed and applied to create a corpus for a lexicon consisting of thousands of nouns. Using this corpus, we propose a novel network-based implementation of DSMs, which is based on the notion of semantic neighborhoods. Semantic neighborhoods are regarded as a parsimonious representation of corpus statistics, and they capture two main types of lexical relations: semantic and associative. The problem of automatically classifying associative and semantic relations is also addressed, motivated by findings from the literature of psycholinguistics and corpus linguistics. Moreover, three novel neighborhood-based similarity metrics are proposed, motivated by the hypotheses of attributional and maximum sense similarity. The proposed metrics are shown to outperform the baseline approaches for the task of semantic similarity estimation between words.

    Inspired by evidence for the cognitive organization of concepts based on their degree of concreteness, we further investigate the performance and organization of network DSMs for abstract vs. concrete nouns. Finally, the framework of network DSMs is extended to the creation of multimodal networks using textual and visual features, and to the estimation of semantic similarity beyond the word level (noun compounds). Very good results are achieved for both extensions, showing the flexibility of the network-based framework.
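
As a rough illustration of two ideas summarized in the abstract, the Python sketch below first contrasts the number of web queries needed when every word pair is queried jointly (quadratic in the lexicon size) with the number needed when each word is queried once (linear). It then builds toy semantic neighborhoods from co-occurrence vectors. Everything here is an assumption made for illustration and is not taken from the thesis: the function names, the toy counts in `VECTORS`, the use of cosine similarity, and the metric `max_neighborhood_similarity`, which is only loosely inspired by the maximum sense similarity hypothesis.

```python
import math


def pairwise_query_count(lexicon_size):
    """Queries needed if every unordered word pair is queried jointly."""
    return lexicon_size * (lexicon_size - 1) // 2


def individual_query_count(lexicon_size):
    """Queries needed if each word is queried once on its own."""
    return lexicon_size


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(feature, 0.0) for feature, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def neighborhood(word, vectors, size):
    """Return the `size` words most similar to `word` (its neighborhood)."""
    scored = [
        (other, cosine(vectors[word], vectors[other]))
        for other in vectors
        if other != word
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [other for other, _ in scored[:size]]


def max_neighborhood_similarity(w1, w2, vectors, size):
    """Hypothetical metric: best match between one word and the
    neighbors of the other, taken in both directions."""
    candidates = [cosine(vectors[w1], vectors[x])
                  for x in neighborhood(w2, vectors, size)]
    candidates += [cosine(vectors[w2], vectors[y])
                   for y in neighborhood(w1, vectors, size)]
    return max(candidates, default=0.0)


# Toy co-occurrence counts (invented for this example).
VECTORS = {
    "car": {"drive": 4, "road": 3, "engine": 2},
    "automobile": {"drive": 3, "road": 2, "engine": 3},
    "banana": {"eat": 4, "fruit": 3, "yellow": 2},
    "apple": {"eat": 3, "fruit": 4, "tree": 1},
}

if __name__ == "__main__":
    print(pairwise_query_count(1000), individual_query_count(1000))
    print(neighborhood("car", VECTORS, 1))
    print(max_neighborhood_similarity("car", "automobile", VECTORS, 1))
```

For a lexicon of 1000 words, the pairwise scheme needs 499500 queries while the per-word scheme needs 1000, which is why the linear scheme scales to lexicons of thousands of nouns.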

© Technical University of Crete 2012