

Construction dynamique incrémentale de graphes de connaissances par fouille de contenus (Dynamic incremental construction of knowledge graphs through content mining), by Cyrielle Mallart


This thesis presents several works on relation extraction and classification in articles from Ouest-France, the largest newspaper in France. This use case raises several challenges around the available data, including a lack of annotated corpora and imbalanced data. The works presented here therefore discuss two ways of applying high-performing state-of-the-art methods to this scenario, while questioning the relevance of state-of-the-art models in this setting. The first approach detects irrelevant entity pairs and filters them out before they reach the classification model, improving classification quality by improving the quality of the samples to predict. The second approach is active learning: samples are fed to the model incrementally, and at each iteration the samples are selected to maximize the prediction performance of the relation classification model. Both approaches improved the performance of simple relation classification models, whereas the complexity of state-of-the-art models proved incompatible with the type and amount of data currently available at Ouest-France. Additionally, we briefly explore several options for unsupervised relation extraction, which is not adaptable to our task, and for self-supervised representation of relations, which shows results encouraging enough to be explored in the future.
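To make the active learning idea concrete, here is a minimal sketch of such a loop. It is not the method used in the thesis: the abstract does not say which selection criterion or classifier was used, so this example assumes margin-based uncertainty sampling (a common choice) and a toy nearest-centroid classifier. The names `fit`, `margin`, `predict`, `active_learning` and `oracle`, the labels `relation` / `no_relation`, and the one-feature data are all hypothetical and chosen only for illustration.

```python
import math


def centroid(points):
    """Mean vector of a non-empty list of equal-length feature vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]


def fit(labeled):
    """Train a toy nearest-centroid classifier: one centroid per label."""
    by_label = {}
    for x, y in labeled:
        by_label.setdefault(y, []).append(x)
    return {y: centroid(xs) for y, xs in by_label.items()}


def margin(model, x):
    """Gap between the two closest centroids; a small gap means an uncertain sample."""
    d = sorted(math.dist(x, c) for c in model.values())
    return d[1] - d[0] if len(d) > 1 else float("inf")


def predict(model, x):
    """Return the label whose centroid is closest to x."""
    return min(model, key=lambda y: math.dist(x, model[y]))


def active_learning(seed, pool, oracle, rounds, batch_size):
    """Each round, label the pool samples with the smallest margin, then retrain."""
    labeled, pool = list(seed), list(pool)
    model = fit(labeled)
    for _ in range(rounds):
        if not pool:
            break
        pool.sort(key=lambda x: margin(model, x))  # most uncertain first
        batch, pool = pool[:batch_size], pool[batch_size:]
        labeled += [(x, oracle(x)) for x in batch]  # ask the annotator
        model = fit(labeled)
    return model, labeled


# Hypothetical data: one informative feature; the "annotator" is a simple rule.
def oracle(x):
    return "relation" if x[0] > 0 else "no_relation"


seed = [((-1.0, 0.0), "no_relation"), ((1.0, 0.0), "relation")]
pool = [(i / 10, 0.0) for i in range(-9, 10)]

model, labeled = active_learning(seed, pool, oracle, rounds=2, batch_size=3)
print(len(labeled))  # 2 seed samples + 2 rounds * 3 samples = 8
```

In this toy run, the first batch sent for annotation is made of the three pool points closest to the initial decision boundary, which is the behaviour uncertainty sampling is meant to produce. With a real relation classifier, `fit` and `margin` would be replaced by the model's own training routine and confidence scores.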

Source: http://www.theses.fr/2022ISAR0018

