Gadat, Sébastien and Villeneuve, Stéphane (2023) Parsimonious Wasserstein Text-mining. TSE Working Paper, n. 23-1471, Toulouse
Abstract
This document introduces a novel, parsimonious method of processing textual data, based on non-negative matrix factorization (NMF) and on supervised clustering with Wasserstein barycenters to reduce the dimension of the model. This dual treatment of textual data allows a text to be represented as a probability distribution on the space of profiles, which accounts for both uncertainty and semantic interpretability through the Wasserstein distance. The full textual information of a given period is represented as a random probability measure. This opens the door to a statistical inference method that seeks to predict financial data using the information generated by the texts of a given period.
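To make the first step of the abstract concrete, the sketch below shows how an NMF of a document-term matrix can yield, for each text, a probability distribution over a small number of profiles. It is only an illustration under stated assumptions, not the authors' implementation: the toy matrix `V`, the number of profiles `k`, the Lee-Seung multiplicative updates, and the helper names `nmf` and `to_distribution` are all hypothetical choices made here. The Wasserstein barycenter clustering step is not covered.

```python
import random

def matmul(A, B):
    """Plain-Python matrix product of two lists of lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def nmf(V, k, iters=200, seed=0, eps=1e-9):
    """Approximate a nonnegative matrix V (texts x terms) by W @ H with
    W, H >= 0, using Lee-Seung multiplicative updates (Frobenius loss).
    Assumption: the paper's actual NMF solver is not specified here."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        # Update H: H <- H * (W^T V) / (W^T W H)
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[H[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(k)]
        # Update W: W <- W * (V H^T) / (W H H^T)
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[W[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(n)]
    return W, H

def to_distribution(row, eps=1e-12):
    """Normalize nonnegative profile weights so they sum to (almost exactly) one."""
    total = sum(row) + eps
    return [x / total for x in row]

# Hypothetical toy term counts: 4 texts, 4 terms.
V = [
    [3, 2, 0, 0],
    [2, 3, 0, 0],
    [0, 0, 4, 1],
    [0, 0, 1, 4],
]
W, H = nmf(V, k=2)

# Each text becomes a probability distribution over the k profiles.
profiles = [to_distribution(w) for w in W]
for p in profiles:
    print([round(x, 3) for x in p])
```

Each row of `profiles` is nonnegative and sums to one, so it can be read as a discrete probability measure on the profiles, which is the kind of object a Wasserstein distance compares.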
Item Type: | Monograph (Working Paper) |
---|---|
Language: | English |
Date: | September 2023 |
Place of Publication: | Toulouse |
Uncontrolled Keywords: | Natural Language Processing, Textual Analysis, Wasserstein distance, clustering |
Subjects: | B- ECONOMIE ET FINANCE |
Divisions: | TSE-R (Toulouse) |
Institution: | Université Toulouse Capitole |
Site: | UT1 |
Date Deposited: | 25 Sep 2023 08:45 |
Last Modified: | 04 Nov 2024 12:18 |
OAI Identifier: | oai:tse-fr.eu:128497 |
URI: | https://publications.ut-capitole.fr/id/eprint/48255 |