Leveraging AI to investigate the impact of different research funding programs on research outcomes

Funding plays a crucial role in determining the success of research endeavors. Thus, it is essential to evaluate the influence of funding on the scientific output of researchers. In this study, we examine the impact of various funding programs provided by the Natural Sciences and Engineering Research Council (NSERC) and compare the impact and productivity of the funded researchers. We first conduct a descriptive analysis to understand the trends in different funding programs offered by NSERC, including changes in funding amounts and allocation across provinces, universities, and other relevant factors over time. Next, we use statistical models and machine learning algorithms trained on the integrated database of researchers’ publications and funding to determine the efficacy of different NSERC funding programs. The main objective is to gain insights into the impact of funding on the scientific output of researchers. The study is ongoing, and we are now at the model-building stage.


Introduction
Research funding refers to the financial support provided to research teams and/or individuals for their scientific pursuits, following the submission and approval of a research proposal by a relevant funding agency (Huang & Huang, 2018). Funding is a crucial aspect of scientific activities that can significantly impact research productivity. With additional funding, researchers might be able to increase their number of publications, improve the quality of their work, and ultimately generate a higher impact. Furthermore, funding enables researchers to collaborate more effectively, increasing the quantity and potentially improving the quality of their published papers (Ebadi & Schiffauerova, 2015). As a result, evaluating the influence of funding on scientific output and seeking more effective methods for monitoring and evaluating funding allocation are of paramount importance.
In this study, we focus on researchers supported by the Natural Sciences and Engineering Research Council (NSERC), the main federal funding organization in Canada for research in the natural sciences and engineering. NSERC offers a diverse range of funding programs, each designed to encourage and support researchers in achieving specific goals. These programs promote different objectives such as innovation, fostering collaboration among the research community, and facilitating partnerships between academia and industry (Veletanlic & Sa, 2020). We investigate the effect of various funding programs on the outcomes of NSERC-funded researchers in terms of productivity, quality/impact, and scientific collaboration. By analyzing the impact of different programs, our goal is to shed light on the effectiveness of NSERC's programs in supporting researchers and to identify areas for improvement.

Data and Methodology
Three different data sources are used in this study. First, we collect information about funded researchers in natural sciences and engineering who were supported by NSERC from 1982 to 2018. The funding dataset includes (but is not limited to) metadata such as the name of the researcher who received funding, the funding program, the year and duration of the financial support, the researcher's affiliation, and the amount of funding. We selected the main programs that have been active over multiple years and have supported a considerable number of researchers over time. The selected programs are:
- "Discovery grants program - individuals": Designed to support researchers and enhance their innovation with a long-term perspective.
- "Collaborative research and development grants": Supports collaboration between Canadian universities and industry in the private or public sector.
- "Strategic projects - group": Supports research in specific areas of interest expected to have a strong positive impact on Canada over the coming 10 years.
- "Canada research chairs": Aims to foster research excellence by supporting world-class researchers.
Next, we collect from Elsevier's Scopus the publications of researchers who received funding from the abovementioned programs. Scopus was selected due to its completeness and accuracy compared to other databases such as Google Scholar and Web of Science (Tahmooresnejad et al., 2015). The publications database contains (but is not limited to) metadata such as the researcher's name, the title of the publication, the date of publication, the journal name, the author's affiliation, and funding information. The last data source is the journal impact factor, which is collected from SCImago. Using this database, we can compare the quality/impact of the journals in which the articles have been published.
To date, we have collected, thoroughly cleaned, and pre-processed the data, and integrated all three databases. The integration phase required entity disambiguation so that funding and publication data could be linked using the names of the funding recipients and the authors' names in the Scopus database. Similarly, we linked the journal impact factor to the articles in the Scopus database by matching the journal names in the two databases. To conduct a focused analysis, we included in our model only the subset of variables relevant to the analysis and left out extraneous ones. Our model comprises several numeric variables, all of which have been scaled (except the dependent variable), and a categorical variable that represents the funding program.
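The name-based linking step described above can be illustrated with a minimal sketch. It is not the disambiguation procedure used in this study: the field names (`recipient`, `author`), the sample records, and the matching key (surname plus first initial, accents removed) are all assumptions made for illustration. A key this coarse merges distinct people who share a surname and initial, so a real pipeline would also need evidence such as affiliation or research field.

```python
import unicodedata


def normalize_name(name):
    """Return a crude matching key: lowercase surname plus first initial."""
    # Strip accents so that "Côté" and "Cote" produce the same key.
    ascii_name = (
        unicodedata.normalize("NFKD", name)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    ascii_name = ascii_name.replace(".", " ").strip()
    if not ascii_name:
        return ""
    if "," in ascii_name:
        # "Surname, Given" format.
        surname, given = [part.strip() for part in ascii_name.split(",", 1)]
    else:
        # "Given [Middle] Surname" format.
        parts = ascii_name.split()
        surname, given = parts[-1], " ".join(parts[:-1])
    return f"{surname.lower()} {given[:1].lower()}".strip()


def link_records(funding_rows, publication_rows):
    """Pair each grant with every publication whose author key matches."""
    pubs_by_key = {}
    for pub in publication_rows:
        pubs_by_key.setdefault(normalize_name(pub["author"]), []).append(pub)
    linked = []
    for grant in funding_rows:
        key = normalize_name(grant["recipient"])
        for pub in pubs_by_key.get(key, []):
            linked.append((grant, pub))
    return linked


# Hypothetical records, for illustration only.
funding = [{"recipient": "Tremblay, Jean", "program": "Canada research chairs"}]
publications = [
    {"author": "J. Tremblay", "title": "Paper A"},
    {"author": "Smith, A.", "title": "Paper B"},
]
pairs = link_records(funding, publications)
```

The same key-based lookup could in principle be applied to journal names when attaching impact factors, with a normalization suited to journal titles instead of personal names.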
Having collected and integrated the data, we plan to perform descriptive exploratory analyses to study different funding programs' temporal trends and to understand how they have changed over time in terms of the amount and share of allocation to different provinces, universities, etc. Next, we will employ statistical analysis and machine learning techniques to investigate the interrelationships between funding programs and scientific performance and estimate the effectiveness of different funding programs. We will build baseline statistical models on the integrated dataset and will compare their results with a select set of machine learning models, e.g., artificial neural networks, support vector machines, and random forests. The goal is to capture the long-term and short-term impact of funding on the productivity, quality, and collaboration of researchers.
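As an illustration of the descriptive step, the share of funding allocated to each province in a given year could be computed along the following lines. This is a sketch under assumed field names (`year`, `province`, `amount`) and does not reflect the actual schema of the funding dataset; the same function would work for universities or programs by changing the grouping field.

```python
from collections import defaultdict


def allocation_shares(grants, group_field="province"):
    """Return {year: {group: share of that year's total funding}}."""
    totals = defaultdict(float)
    by_group = defaultdict(lambda: defaultdict(float))
    for grant in grants:
        totals[grant["year"]] += grant["amount"]
        by_group[grant["year"]][grant[group_field]] += grant["amount"]
    return {
        year: {group: amount / totals[year] for group, amount in groups.items()}
        for year, groups in by_group.items()
        if totals[year] > 0  # skip years with no funding to avoid division by zero
    }


# Hypothetical records, for illustration only.
sample_grants = [
    {"year": 2000, "province": "QC", "amount": 100.0},
    {"year": 2000, "province": "ON", "amount": 300.0},
    {"year": 2001, "province": "QC", "amount": 50.0},
]
shares = allocation_shares(sample_grants)
```

Tracking these shares year by year gives the temporal trend in allocation rather than only the absolute amounts.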
We have four dependent variables. To measure the research productivity of the NSERC-supported researchers, we use the number of publications as a proxy (Godin, 2003; Ebadi & Schiffauerova, 2013). Furthermore, to assess the impact of the researchers' work, we consider two key metrics: 1) the number of citations of their papers (Ebadi & Schiffauerova, 2013). By considering these metrics as dependent variables, we aim to provide a comprehensive evaluation of the impact of NSERC funding programs on research outcomes, covering different aspects of scientific activities.
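The two count-based dependent variables named above, number of publications and number of citations, could be derived from the publication records roughly as follows. The field names (`researcher`, `year`, `citations`) and the per-researcher, per-year aggregation level are assumptions for illustration, not a description of the study's actual data preparation.

```python
from collections import defaultdict


def yearly_outcomes(papers):
    """Return {(researcher, year): {"publications": n, "citations": c}}."""
    outcomes = defaultdict(lambda: {"publications": 0, "citations": 0})
    for paper in papers:
        key = (paper["researcher"], paper["year"])
        outcomes[key]["publications"] += 1  # productivity proxy
        outcomes[key]["citations"] += paper["citations"]  # impact proxy
    return dict(outcomes)


# Hypothetical records, for illustration only.
sample_papers = [
    {"researcher": "r1", "year": 2010, "citations": 4},
    {"researcher": "r1", "year": 2010, "citations": 6},
    {"researcher": "r2", "year": 2011, "citations": 1},
]
outcomes = yearly_outcomes(sample_papers)
```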