Turning Machine Intelligence Against Lung Cancer

This Year’s Challenge

Using a data set of high-resolution lung scans from hundreds of patients, provided by the National Cancer Institute, Data Science Bowl participants will develop artificial intelligence algorithms to accurately determine when lesions in the lungs are cancerous. This will dramatically reduce the false positive rate that plagues the current technology.

Competition results have the potential to advance our understanding of how all types of cancer develop and spread in the body. They’ll also free radiologists to spend more time with patients. 

Compete on Kaggle

Breathe in the future. Breathe out the past.

Cancer Moonshot

In the U.S., cancer will strike two in every five people in their lifetimes. But it affects all of us.

That’s why, in 2016, the office of the Vice President announced the Cancer Moonshot. It’s an audacious effort to make a decade’s worth of progress in cancer prevention, diagnosis, and treatment in just five years.

The 2017 Data Science Bowl will pursue one of the Cancer Moonshot’s key goals: unleashing the power of data against this deadly disease. Presented by Booz Allen and Kaggle, the competition will convene the data science and medical communities to develop cancer detection algorithms, and help end the disease as we know it.

The Lung Cancer Detection Challenge

Lung cancer is one of the most common types of cancer, with nearly 225,000 new cases of the disease expected in the U.S. in 2016.

Early detection is critical, as it opens a range of treatment options not available when cancer is detected at later, more advanced stages. Low-dose computed tomography (CT) is a potential breakthrough technology for early detection, with the ability to reduce deaths by 20% [1]. Often, suspicious lesions identified in screening are initially assessed as high risk for cancer, but after additional follow-up tests, they turn out to be non-cancerous (false positives from the initial screening) [2]. Can machine learning reduce the number of radiology exams flagged for potentially unnecessary follow-up and avoid patient anxiety?

[1] Aberle DR, Adams AM, Berg CD, et al. Reduced lung-cancer mortality with low-dose computed tomographic screening. N Engl J Med 365 (5): 395-409, 2011.
[2] Low-dose CT has historically resulted in high false positive rates of around 25% (Aberle et al., N Engl J Med 365 (5): 395-409, 2011).
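
To make the false positive idea concrete, here is a minimal sketch of how a false positive rate could be measured for a detection algorithm. It assumes one common definition (the share of non-cancerous cases that get flagged), which may differ from how the cited study reports its figure. The function name, the 0.5 threshold, and the toy data are all hypothetical and are not part of the competition's official scoring.

```python
def false_positive_rate(labels, scores, threshold=0.5):
    """Share of non-cancerous cases (label 0) flagged as suspicious.

    labels: 1 for confirmed cancer, 0 for non-cancerous (hypothetical encoding).
    scores: model outputs in [0, 1]; a case is flagged when score >= threshold.
    """
    false_pos = 0
    true_neg = 0
    for label, score in zip(labels, scores):
        if label == 0:
            if score >= threshold:
                false_pos += 1  # healthy case flagged for follow-up
            else:
                true_neg += 1   # healthy case correctly cleared
    negatives = false_pos + true_neg
    if negatives == 0:
        raise ValueError("need at least one non-cancerous case")
    return false_pos / negatives


# Toy data, invented for illustration only.
labels = [0, 0, 0, 0, 1, 1]
scores = [0.1, 0.7, 0.3, 0.2, 0.9, 0.6]

fpr_default = false_positive_rate(labels, scores)                # threshold 0.5
fpr_strict = false_positive_rate(labels, scores, threshold=0.8)  # stricter cutoff
print(fpr_default, fpr_strict)
```

In this toy data, the stricter threshold flags no healthy cases, but it also misses the cancer case scored 0.6. A useful algorithm has to lower false positives without giving up detection of real cancers.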

The Prize

This year, the Data Science Bowl will award a total prize purse of $1 million, provided by the Laura and John Arnold Foundation, to those who observe the right patterns, ask the right questions, and in turn, create unprecedented impact around this high-priority issue.

First Place: $500,000

Second Place: $200,000

Third Place: $100,000

4th-10th Place: $25,000

In addition, $5,000 will be awarded to each of the three most highly voted Kernels (a total of $15,000), and $10,000 in prizes will be awarded for sharing your Data Science Bowl journey on social media. More details will be announced on February 1, 2017.

Start Your Submission Today

The Data Science Bowl is your opportunity to learn new skills, forge connections with a global community of problem solvers, and be part of something bigger than any one of us: making cancer a thing of the past.

Data sets are available for download from January 12 through the end of the competition on April 12. Visit Kaggle.com for more details and begin working on your submission today.