The main goal of this project is to create an online corpus about the war in Ukraine in 2022. Achieving this
requires the following steps:
Extract news articles from a range of newspapers and comments from social media, and automatically convert
them to a common format suitable for all documents;
Apply natural language processing techniques to clean the extracted information, reducing it to what is
relevant and required for the project;
Store the extracted information in a repository;
Implement a web platform where users can search and analyze the newspaper articles and social media
comments about the conflict.
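
The steps listed above could be sketched as a small pipeline. The following is only an illustration under stated assumptions, not the project's actual implementation: the `Document` fields, the input keys `source` and `body`, the SQLite table layout, and all function names are hypothetical. A simple regular-expression cleanup stands in for the natural language processing step, and a substring query stands in for the platform's search.

```python
import html
import re
import sqlite3
from dataclasses import dataclass


@dataclass
class Document:
    """Common format shared by articles and comments (assumed fields)."""
    source: str  # newspaper or social media platform name
    kind: str    # "article" or "comment"
    text: str    # cleaned body text


TAG_RE = re.compile(r"<[^>]+>")
SPACE_RE = re.compile(r"\s+")


def clean_text(raw: str) -> str:
    """Strip HTML tags, decode HTML entities, and collapse whitespace."""
    no_tags = TAG_RE.sub(" ", raw)
    return SPACE_RE.sub(" ", html.unescape(no_tags)).strip()


def to_document(record: dict, kind: str) -> Document:
    """Convert a raw record (assumed keys: 'source', 'body') to the common format."""
    return Document(source=record["source"], kind=kind, text=clean_text(record["body"]))


def open_repository(path: str = ":memory:") -> sqlite3.Connection:
    """Open the repository; an in-memory SQLite database by default."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS documents ("
        "id INTEGER PRIMARY KEY, source TEXT, kind TEXT, text TEXT)"
    )
    return conn


def store(conn: sqlite3.Connection, doc: Document) -> None:
    """Insert one document into the repository."""
    conn.execute(
        "INSERT INTO documents (source, kind, text) VALUES (?, ?, ?)",
        (doc.source, doc.kind, doc.text),
    )
    conn.commit()


def search(conn: sqlite3.Connection, keyword: str) -> list[Document]:
    """Return documents whose text contains the keyword, in insertion order."""
    rows = conn.execute(
        "SELECT source, kind, text FROM documents WHERE text LIKE ? ORDER BY id",
        (f"%{keyword}%",),
    )
    return [Document(*row) for row in rows]


if __name__ == "__main__":
    repo = open_repository()
    raw = {"source": "Example Daily", "body": "<p>Talks  resumed &amp; stalled.</p>"}
    store(repo, to_document(raw, "article"))
    for hit in search(repo, "talks"):
        print(hit.source, hit.kind, hit.text)
```

Keeping articles and comments in one common record type means the cleaning, storage, and search steps do not need to know where a document came from. A real deployment would likely replace the substring query with a full-text index.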