Machine Learning in Radiology. Moscow Lung Cancer Screening Dataset.

Moscow Radiology: Open Access Data Repository for Machine Learning and Artificial Intelligence


Name: Dataset CTLungCa-500 v.3.1, “Tagged Chest Computed Tomography (CT) Images”, version 3.1

Authors: Morosov SP, Kulberg NS, Gombolevskiy VA, Ledikhova NA, Sokolina IA, Vladzimirskiy AV, Bardin AS

Source: Patients aged 50-75 years who underwent CT in outpatient clinics in Moscow (Russia) on referral from their attending physician.


The dataset contains 541 CT scans of patients at high risk of lung cancer, together with the associated radiologist annotations. Each scan was independently inspected by six radiologists, who paid special attention to lesions ranging in size from 3 mm to 30 mm. For every lung nodule they marked, the radiologists measured the maximum transverse diameter and specified a type. The marks and annotations were then reviewed by an additional expert.
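As a rough illustration of how the annotations described above might be handled in code, the sketch below models one radiologist's mark and keeps only marks within the stated 3 mm to 30 mm range. The record layout, the field names (scan_id, reader_id, diameter_mm, nodule_type) and the example type labels are assumptions made for this sketch; they are not the dataset's actual file format or label set.

```python
from dataclasses import dataclass

MIN_DIAMETER_MM = 3.0   # lower bound of the size range stated in the description
MAX_DIAMETER_MM = 30.0  # upper bound of the size range stated in the description


@dataclass(frozen=True)
class NoduleMark:
    """One radiologist's mark on one scan (hypothetical layout, not the dataset schema)."""
    scan_id: str        # hypothetical scan identifier
    reader_id: int      # 1..6, one of the six independent readers
    diameter_mm: float  # maximum transverse diameter of the nodule
    nodule_type: str    # type label assigned by the reader (placeholder values)


def in_size_range(mark):
    """Return True if the mark's diameter lies within 3 mm to 30 mm inclusive."""
    return MIN_DIAMETER_MM <= mark.diameter_mm <= MAX_DIAMETER_MM


def readers_per_scan(marks):
    """Map each scan ID to the set of readers with at least one in-range mark."""
    result = {}
    for mark in marks:
        if in_size_range(mark):
            result.setdefault(mark.scan_id, set()).add(mark.reader_id)
    return result
```

Calling readers_per_scan on a list of NoduleMark records gives, for each scan, the set of readers who marked at least one nodule in the size range, which is one simple way to gauge agreement among the six readers before using the marks for training.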

The dataset is intended primarily for training machine learning algorithms for medical diagnostics and for developing artificial intelligence systems in the healthcare sector.

Terms of use: anyone is free to share (copy, distribute, and transmit) and to remix (adapt and make derivative works of) the dataset, provided that the authors of the original dataset are given credit.

If you have used the MoscowRadiology-CTLungCa-500 dataset in a scientific publication, please cite:

                      S.P. Morosov, N.S. Kulberg, V.A. Gombolevskiy, N.V. Ledikhova, I.A. Sokolina, A.V. Vladzimirskiy, A.S. Bardin. (2018) Tagged results of lung computed tomography scans (RU 2018620500).