Paper details

Title: H-TFIDF: What makes areas specific over time in the massive flow of tweets related to the covid pandemic?

Authors: Rémy Decoupes, Rodrique Kafando, Mathieu Roche, Maguelonne Teisseire

Abstract: Obtained from CrossRef

Data produced by social networks may contain weak signals of possible epidemic outbreaks. In this paper, we focus on Twitter data during the waiting period before the appearance of the first COVID-19 cases outside China. Among the huge flow of tweets that reflects a growing global concern in all countries, we propose to analyze such data with an adaptation of the TF-IDF measure. It allows users to extract the discriminant vocabularies used across time and space. The results are then discussed to show how the specific spatio-temporal anchoring of the extracted terms makes it possible to follow the crisis dynamics on different scales of time and space.

Codecheck details

Certificate identifier: 2021-009

Codechecker name: Daniel Nüst

Time of codecheck: 2021-06-10 12:00:00

Repository: https://osf.io/rdnyu

Codecheck report: https://doi.org/10.17605/osf.io/rdnyu

Summary:

The authors provide a well-documented workflow analysing a large number of tweets over a considerable time span. Because of the data size, the authors provided instructions for a data subset. With that subset, the code could be executed successfully and the created figures match the provided baseline. The codecheck also confirms that the data can be created and that the code is available for the results reported in the paper.


© Stephen Eglen & Daniel Nüst

Published under CC BY-SA 4.0

CODECHECK is a process for independent execution of computations underlying scholarly research articles.