Smart Probabilistic Modeling of Heterogeneous Information and Text Networks

Apr 11, 2014
Seminar

Speaker: Omiros Metaxas

Date: 11/04/2014

University: University of Athens

Room: A56

Time: 13:30

Abstract:

We propose TAHINI, an intelligent and scalable probabilistic framework for mining Text Augmented Heterogeneous Information Networks (TA-HINets), which are composed of interconnected entities characterized by free-text attributes (e.g., papers, web pages) or other Bag of Words (BoW) representations (e.g., user actions), together with related side information (e.g., labels, metadata, tags, images).

First, we propose an innovative workflow for transforming different kinds of data and modalities into multiple interrelated BoW vectors that form a star around one central entity, capturing all information spaces.
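As an illustration of the star-shaped representation, the following minimal sketch turns one central entity into one BoW vector per modality. The field names (`text`, `tags`, `authors`), the whitespace tokenizer, and the helper `build_star` are assumptions made for this example only; the actual TAHINI workflow is not described in the abstract.

```python
from collections import Counter


def build_star(entity):
    """Return one bag-of-words Counter per modality for a central entity.

    Hypothetical helper: the modality names and the tokenization are
    illustrative assumptions, not the workflow presented in the talk.
    """
    return {
        # Free text: lowercase, whitespace-separated tokens.
        "text": Counter(entity.get("text", "").lower().split()),
        # Side information is already discrete, so each item is one token.
        "tags": Counter(entity.get("tags", [])),
        "authors": Counter(entity.get("authors", [])),
    }


# A made-up central entity (a paper) with three modalities.
paper = {
    "text": "Topic models for text networks and topic discovery",
    "tags": ["lda", "networks"],
    "authors": ["author_1", "author_2"],
}

star = build_star(paper)
for modality, bow in star.items():
    print(modality, dict(bow))
```

Each modality keeps its own vocabulary and counts, while all vectors stay tied to the same central entity. That shared link is what a multi-modal topic model can exploit.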

Then, building upon the well-established Latent Dirichlet Allocation (LDA), we propose MIX-LDA, a new multi-modal probabilistic topic model for interrelated count data that infers both single-modal (private) and multi-modal (shared) topics, adapting to the extent of correlation between the different modalities and leveraging statistical strength among them. Finally, we present a scalable Gibbs sampling technique for inference and demonstrate the efficiency of the proposed framework in several real-world experiments, inferring interesting patterns, groups, similarities and latent interrelationships within and across different data types and modalities.
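For readers unfamiliar with Gibbs sampling for topic models, the sketch below shows a collapsed Gibbs sampler for plain, single-modality LDA, the model MIX-LDA builds upon. It is not MIX-LDA and not the scalable sampler presented in the talk. The function name `lda_gibbs`, the hyperparameter values, and the toy corpus are assumptions for illustration.

```python
import random


def lda_gibbs(docs, num_topics, vocab_size, alpha=0.1, beta=0.01, iters=50, seed=0):
    """Collapsed Gibbs sampling for standard LDA (illustrative sketch).

    docs is a list of documents, each a list of integer word ids.
    """
    rng = random.Random(seed)
    n_dk = [[0] * num_topics for _ in docs]                # document-topic counts
    n_kw = [[0] * vocab_size for _ in range(num_topics)]   # topic-word counts
    n_k = [0] * num_topics                                 # tokens per topic
    z = []                                                 # topic of every token

    # Random initialization of the topic assignments.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_d.append(k)
            n_dk[d][k] += 1
            n_kw[k][w] += 1
            n_k[k] += 1
        z.append(z_d)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the current token from the counts.
                k = z[d][i]
                n_dk[d][k] -= 1
                n_kw[k][w] -= 1
                n_k[k] -= 1
                # Unnormalized conditional p(z = t | all other assignments).
                weights = [
                    (n_dk[d][t] + alpha)
                    * (n_kw[t][w] + beta)
                    / (n_k[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                # Resample the topic and restore the counts.
                k = rng.choices(range(num_topics), weights=weights)[0]
                z[d][i] = k
                n_dk[d][k] += 1
                n_kw[k][w] += 1
                n_k[k] += 1
    return z, n_dk, n_kw


# Toy corpus: word ids 0-1 and 2-3 tend to co-occur in separate documents.
docs = [[0, 1, 0, 1], [2, 3, 2, 3], [0, 1, 1, 0]]
z, n_dk, n_kw = lda_gibbs(docs, num_topics=2, vocab_size=4)
print(n_dk)
```

A multi-modal model would keep separate topic-word counts per modality while tying the modalities together through the central entity. How MIX-LDA splits private and shared topics is not specified in the abstract, so it is left out here.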
