Published July 2021 | public
Book Section - Chapter

COVID19 Tweeter Dataset Sentiment Analysis

Abstract

COVID19 (where 'CO' stands for corona, 'VI' for virus, and 'D' for disease) has been declared a global pandemic by the WHO. At the start of 2020 it was limited to China, but now more than 206 countries are affected by COVID-19, more than 3.5 billion people have been infected worldwide, and more than 1 million of them have died from this incurable disease. The WHO has not approved any vaccine to date. People around the globe have been affected by COVID19, and they have written their views on social media, mainly on Twitter. Over the last 9 months, hundreds of billions of texts have been written on Twitter. Sentiment analysis is a natural language processing (NLP) application used to categorize the sentiment of a text as positive, negative, or neutral. Different machine learning (ML) algorithms are used to extract sentiment from text, but those algorithms require the text in a specific format. Producing that format is a major step in the whole sentiment analysis process, because the data available on Twitter is in raw form and requires a lot of preprocessing and cleaning before it can be used for sentiment analysis. This article discusses Twitter data related to COVID19 in detail: the different ways to use Twitter data for sentiment analysis, the difficulties involved, the steps in Twitter data preprocessing, and the final ready-to-use form of the dataset. Python is used as the programming language for sentiment analysis in this article, and also for data cleaning and preprocessing. The Python libraries used for data preprocessing are also discussed.
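
The abstract does not list the individual preprocessing steps or the libraries used, so the following is only a minimal sketch of what cleaning raw tweet text in Python might look like. The function name `clean_tweet`, the chosen steps (dropping the retweet marker, URLs, and mentions, keeping hashtag words, lowercasing, removing punctuation), and the use of the standard-library `re` module are all assumptions made for illustration, not the authors' pipeline.

```python
import re

# Hypothetical patterns for common tweet noise (assumed, not from the paper).
URL_RE = re.compile(r"https?://\S+|www\.\S+")   # links
MENTION_RE = re.compile(r"@\w+")                # @user mentions
RT_RE = re.compile(r"^rt\s+", re.IGNORECASE)    # leading retweet marker
NON_ALNUM_RE = re.compile(r"[^a-z0-9\s]")       # punctuation, emoji, symbols


def clean_tweet(text: str) -> str:
    """Return a lowercased, noise-free version of a raw tweet."""
    text = text.strip()
    text = RT_RE.sub("", text)        # drop a leading "RT "
    text = URL_RE.sub(" ", text)      # drop links
    text = MENTION_RE.sub(" ", text)  # drop @mentions
    text = text.replace("#", "")      # keep the hashtag word, drop the '#'
    text = text.lower()
    text = NON_ALNUM_RE.sub(" ", text)  # drop remaining non-alphanumerics
    return " ".join(text.split())     # collapse repeated whitespace


if __name__ == "__main__":
    raw = "RT @WHO: Stay safe! #COVID19 updates at https://example.org/x"
    print(clean_tweet(raw))
```

Text cleaned this way could then be passed to whichever tokenizer or sentiment classifier is chosen; that downstream step is left out here because the abstract does not specify it.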

Additional Information

© 2021 IEEE.

Additional details

Created:
August 20, 2023
Modified:
October 20, 2023