Published on Mon Aug 09 2021

KnowMore: An Automated Knowledge Discovery Tool for the FAIR SPARC Datasets

Quey, R., Schiefer, M. A., Kiran, A., Patel, B.

KnowMore is an automated knowledge discovery tool for SPARC datasets. It was developed during the 2021 NIH SPARC FAIR Data Codeathon. The tool uses data science methods and Machine Learning algorithms to generate visualizations.

Abstract

This manuscript describes the methods and outcomes of KnowMore, the Grand Prize-winning automated knowledge discovery tool developed by our team during the 2021 NIH SPARC FAIR Data Codeathon. The NIH SPARC program generates rich datasets from neuromodulation research, curated according to the Findable, Accessible, Interoperable, and Reusable (FAIR) SPARC data standards. These datasets are publicly available through the SPARC Data Portal at sparc.science. Currently, comparing and analyzing multiple SPARC datasets at once is tedious because it requires investigating each dataset of interest individually and downloading all of them to conduct cross-analyses. Streamlining this process is crucial to enable rapid discoveries across SPARC datasets. To fill this need, we created KnowMore, a tool integrated into the SPARC Portal that only requires the user to select their datasets of interest to launch an automated discovery process. KnowMore uses several SPARC resources (Pennsieve, o2S2PARC, SciCrunch, protocols.io, Biolucida), data science methods, and Machine Learning algorithms in the back end to generate various visualizations in the front end that are intended to help the user identify potential similarities, differences, and relations across the datasets. These visualizations can lead to a new discovery or a new hypothesis, or simply guide the user to the next logical step in their discovery process. The outcome of this project is a SPARC Portal-ready code architecture that helps researchers use SPARC datasets more efficiently and fully leverages their FAIR characteristics. The tool has been built and documented so that more data analysis methods and visualization items can be easily added. The potential for automated discoveries from SPARC datasets is huge given the unique SPARC data ecosystem promoting FAIR data practices, and KnowMore has demonstrated only a small part of what could be achieved to speed up discoveries from SPARC datasets.