Translating raw data into insight

By Meghan Chua


Niko Escanilla

Niko Escanilla moved from a background in mathematics to graduate study in artificial intelligence and machine learning because he was looking for a discipline that could be applied in real-world and clinical settings.

As a graduate student in Computer Sciences, Escanilla had the chance to put those techniques to work as a research assistant on a UW2020-funded project, assessing variables that can predict the risk of breast cancer.

The project brought together an interdisciplinary team of researchers to better understand how demographic and genetic variables interact and affect a person’s risk of breast cancer. This meant the team had a large collection of data to sift through, which poses one of the biggest challenges in machine learning: when a data set is too large, particularly in the number of variables it tracks, algorithms have difficulty finding meaningful connections between data points.

Escanilla worked with his advisor, biostatistics and medical informatics professor David Page, to develop an algorithm that could select only the most important features related to breast cancer risk. By narrowing down characteristics such as age, biomarkers, genetic features, and breast density, the algorithm helped the team get a clearer understanding of the information at hand.

“It turned out that of those that were ranked [most important], the top 30 were all interaction pairs, which tells us that interactions do in fact play an important role in diagnosing breast cancer risk,” Escanilla said.
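To give a rough sense of what ranking single variables alongside interaction pairs can look like, here is a minimal sketch in Python. It is not the team's algorithm, which the article does not describe. It assumes binary features, uses mutual information as the scoring rule, and runs on synthetic data; the feature names and the helper functions `mutual_information` and `rank_features` are hypothetical and chosen only for illustration.

```python
import itertools
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information, in bits, between two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    count_x = Counter(xs)
    count_y = Counter(ys)
    total = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        total += (c / n) * math.log2(c * n / (count_x[x] * count_y[y]))
    return total


def rank_features(rows, labels, names):
    """Score single features and pairwise interactions, highest score first."""
    columns = {name: [row[i] for row in rows] for i, name in enumerate(names)}
    candidates = dict(columns)
    for a, b in itertools.combinations(names, 2):
        # Assumption: an interaction pair is modeled as the product of two
        # binary features, so it is 1 only when both features are present.
        candidates[f"{a} x {b}"] = [u * v for u, v in zip(columns[a], columns[b])]
    scores = {
        name: mutual_information(col, labels) for name, col in candidates.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Synthetic, illustrative data only: every combination of three binary
# features, repeated ten times. The outcome is 1 only when feature_a and
# feature_b are both 1, so it is driven entirely by their interaction.
names = ["feature_a", "feature_b", "feature_c"]
rows = list(itertools.product([0, 1], repeat=3)) * 10
labels = [a * b for a, b, _ in rows]

ranked = rank_features(rows, labels, names)
for name, score in ranked:
    print(f"{name}: {score:.3f} bits")
```

In this toy setup the interaction of `feature_a` and `feature_b` scores higher than either feature alone, while `feature_c` carries no information about the outcome. A real study would also need to handle continuous variables, noise, and the large number of candidate pairs.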

Working on the algorithm for this project helped Escanilla see the full picture of a machine learning project, in which the end goal is to turn raw data into insight that can be translated into real-world impact.

“I was able to see the pipeline much more clearly after doing this – acquiring the data, cleaning it, transforming it, running my analyses and then going back to my collaborators and saying look, here’s what we found,” he said. “It’s really this iterative cycle of building this pipeline.”

Since working on the project, Escanilla has graduated with a master’s in computer sciences. He now works as a data science practitioner, a role in which the skills he developed through research at UW–Madison serve him well.

His coursework also helped him understand the fundamentals of data models, preparing him to judge whether a model is appropriate for a given application.

“That helped me gain an appreciation for the models that I’m using,” Escanilla said. “I’m sort of biased, but I like the computer science program here on campus a lot.”

As a graduate student, Escanilla was also supported by an Advanced Opportunity Fellowship (AOF) and a Computation and Informatics in Biology and Medicine Fellowship.