
Database comments on Telegram channels related to cryptocurrencies with sentiments | BMC Research Notes


This data package contains three files. An Excel file contains the opinions expressed in more than ten popular Telegram channels about cryptocurrencies. The monitoring of these channels covers a wide range of cryptocurrencies from December 2023 to March 2024. The comments were collected through the Telegram API, and the code for extracting them is available in the Word file of the package. After extraction, the comments were preprocessed, including equalization, stop-word removal, and lemmatization. The preprocessed data were then fed into the HDRB model, which is described in detail, along with its implementation, by Kia et al. [8]. HDRB is a hybrid model based on deep transfer learning that uses RoBERTa as a backbone and feature extractor, together with a BiGRU deep neural network and an attention layer, to obtain sentiment polarity and text aspects. The dataset package and the Python code for extracting and preprocessing Telegram comments are listed in Table 1.
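As a rough illustration of the kind of text cleaning described above, the following minimal sketch lowercases a comment, strips links and non-letter characters, and drops stop words. It is not the authors' code: the function name `preprocess` and the tiny `STOP_WORDS` set are hypothetical, and lemmatization is omitted because it requires an external lexicon or NLP library.

```python
import re

# Hypothetical, deliberately tiny stop-word list for illustration only;
# the list actually used for the dataset is not specified in this note.
STOP_WORDS = {"a", "an", "the", "is", "are", "to", "of", "and", "in", "on", "for", "it", "this"}


def preprocess(comment: str) -> list[str]:
    """Lowercase a comment, remove links and non-letters, drop stop words."""
    text = comment.lower()
    text = re.sub(r"https?://\S+", " ", text)  # remove links
    text = re.sub(r"[^a-z\s]", " ", text)      # keep ASCII letters and whitespace only
    return [tok for tok in text.split() if tok not in STOP_WORDS]
```

Note that this sketch keeps only ASCII letters, so it would discard emoji and non-Latin text; a real pipeline for multilingual Telegram channels would likely need a different character filter.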

Table 1 Overview of data files/data sets

Dataset 1 has six fields: (1) text, (2) date, (3) views, (4) scores, (5) compound, and (6) sentiment_type. Here, “text” is the preprocessed Telegram comment; “date” is the date and time at which the comment was published; “views” is the number of views the comment received; “scores” gives the percentages of positive, negative, and neutral polarity, as obtained with the HDRB model [8]; “compound” is the sum of all polarities, normalized to lie between −1 (most extreme negative) and +1 (most extreme positive); and “sentiment_type” is the polarity type of the comment (positive, negative, or neutral). Researchers can easily change the number of polarity classes by using the compound values, for example to obtain strongly positive, positive, neutral, negative, and strongly negative.
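One way to derive a five-class labeling from the “compound” column is sketched below. The cut-off values `strong=0.5` and `weak=0.05` and the function name `five_class_label` are assumptions chosen for illustration; the dataset description does not prescribe any thresholds, so they should be tuned to the application.

```python
def five_class_label(compound: float, strong: float = 0.5, weak: float = 0.05) -> str:
    """Map a compound score in [-1, 1] to one of five polarity classes.

    The thresholds are hypothetical defaults, not values from the dataset.
    """
    if not -1.0 <= compound <= 1.0:
        raise ValueError("compound must lie between -1 and +1")
    if compound >= strong:
        return "strongly positive"
    if compound >= weak:
        return "positive"
    if compound > -weak:
        return "neutral"
    if compound > -strong:
        return "negative"
    return "strongly negative"
```

Applied row by row to the compound values, this produces a new label column without re-running the sentiment model.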


