Original article can be found here (source): Artificial Intelligence on Medium
Kaggle & CORD-19
This dataset is available on Kaggle’s website, where it is publicly accessible to any AI researcher through the link below. Most researchers in the AI world will already be familiar with Kaggle, an online community of data scientists and machine learning researchers.
A subsidiary of Google, Kaggle is best known for organizing machine learning and data science challenges, including the current one: the COVID-19 Open Research Dataset Challenge, or simply the CORD-19 Challenge.
The CORD-19 dataset consists of over 29,000 articles, of which 13,000 have full text. All of these articles relate to the study of coronaviruses, covering topics such as case reports, transmission routes, environmental factors, and explorations of treatment strategies. However, not all of these articles are machine-readable in their original form, which makes it hard to use AI tools to extract information that could help us battle this infectious disease.
Fortunately, researchers from the Allen Institute for AI worked hard to transform the content of this enormous body of literature into machine-readable form, which makes data and text mining with machine learning approaches possible.
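To make "machine-readable" concrete, here is a minimal Python sketch of what working with one parsed article might look like. The record below is hand-written for illustration and is not taken from the dataset. The field names (`paper_id`, `metadata`, `body_text`) are assumptions about how a parsed paper could be laid out, and the helper functions `extract_text` and `count_keyword` are hypothetical, so check the actual files on Kaggle before relying on any of this.

```python
import json
from typing import Tuple

# Hypothetical, hand-written record: the field names are assumed for
# illustration and should be checked against the real dataset files.
SAMPLE_RECORD = """
{
  "paper_id": "example-0001",
  "metadata": {"title": "An illustrative coronavirus case report"},
  "body_text": [
    {"section": "Introduction",
     "text": "Coronaviruses are a family of RNA viruses."},
    {"section": "Transmission",
     "text": "Transmission routes include respiratory droplets."}
  ]
}
"""


def extract_text(raw_json: str) -> Tuple[str, str]:
    """Return (title, joined body text) for one JSON-encoded article."""
    record = json.loads(raw_json)
    title = record["metadata"]["title"]
    # Join the paragraph texts into one string; section names are not included.
    body = " ".join(paragraph["text"] for paragraph in record["body_text"])
    return title, body


def count_keyword(text: str, keyword: str) -> int:
    """Count case-insensitive occurrences of a keyword in the text."""
    return text.lower().count(keyword.lower())


title, body = extract_text(SAMPLE_RECORD)
print(title)
print(count_keyword(body, "transmission"))
```

Once every article is available as structured text like this, the same few lines can be applied across the whole collection, which is what makes large-scale text mining practical.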