OSF Add-ons Expand to Include Cloud Computing with Boa

January 16th, 2024
Posted in: OSF, Open Science, Add-ons
Screenshot of Boa on the OSF

A key priority of the OSF is to build for research communities, making their open practices possible and easier. This includes integrating tools that they already use: OSF can be extended with a number of third-party software integrations we call “add-ons”. Until now there have been two types of add-ons for OSF: those connecting to external storage platforms (Google Drive, Dropbox, etc.) and those connecting to citation tools (Zotero, Mendeley). We are excited to announce that the newest OSF add-on introduces a third type: cloud computing.

The newest OSF add-on integrates with Boa, “a domain-specific language and infrastructure that eases mining software repositories.” Developed by Iowa State University's Laboratory of Software Design, Department of Computer Science, Boa is a data mining tool that uses its own query language to extract aggregate data from software repositories. Its infrastructure employs distributed computing techniques, enabling efficient query execution across hundreds of thousands of software projects, with applications spanning mining software repositories (MSR), COVID-19, and genomics research.

From the Boa project website:

Consider answering a question such as "what are the average number of changed files per revision (churn rate) for all projects?" Answering this question ordinarily requires knowledge of (at a minimum): mining project metadata, mining code repository locations, how to access those code repositories, additional filtering code, controller logic, etc.

The Boa language allows a user to put together a single query to answer the question. The OSF add-on allows users to execute these queries over several datasets via an API. The list of datasets available to query through the Boa add-on is found here.
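To make the churn-rate question concrete, here is a minimal Python sketch of the calculation itself, run on a tiny hand-written dataset. It is not Boa code and does not call the Boa API or the OSF add-on. The project names, file names, and the helpers `churn_rate` and `churn_rates` are all hypothetical and exist only for illustration. In Boa, the mining, repository access, and aggregation are handled by the infrastructure rather than by hand.

```python
# Hypothetical toy data: for each project, the list of files changed in each revision.
revisions_by_project = {
    "project-a": [
        ["README.md"],
        ["src/main.py", "src/util.py"],
        ["src/main.py", "tests/test_main.py", "setup.py"],
    ],
    "project-b": [
        ["index.html", "style.css", "app.js", "logo.png"],
        ["index.html"],
    ],
}


def churn_rate(revisions):
    """Average number of changed files per revision for one project."""
    return sum(len(files) for files in revisions) / len(revisions)


def churn_rates(projects):
    """Churn rate for every project that has at least one revision."""
    return {name: churn_rate(revs) for name, revs in projects.items() if revs}


rates = churn_rates(revisions_by_project)
for name, rate in rates.items():
    print(f"{name}: {rate:.2f} changed files per revision")
```

Even in this toy form, the calculation assumes the revision history has already been collected and parsed, which is the part the quoted passage describes as ordinarily requiring the most work.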

To get started with Boa in OSF, you will first need to create an account at the Boa website for the datasets you wish to query. Once you have a username and password, you can set up the Boa integration in your OSF account settings. Queries are uploaded to OSF as text files and then sent using the Boa add-on. Query responses are automatically downloaded to the same OSF folder as the query text. For more help getting started, see this OSF Help Guide.

The OSF team has been working with the team at Iowa State on this integration for several months as part of a project funded by a National Science Foundation grant titled “Collaborative Research: CCRI: ENS: Boa 2.0: Enhancing Infrastructure for Studying Software and its Evolution at a Large Scale”. Dr. Hridesh Rajan, Kingland Professor and Chair of the Department of Computer Science at Iowa State University and Principal Investigator of the grant project, is excited about the advances in open science for software engineering that the project represents. Rajan says:

First, Boa's efficient processing capabilities, when combined with OSF's open data principles, could make large-scale software repository data more accessible and analyzable. Second, the combination of Boa's analysis tools with OSF's open source and open data frameworks can enhance the transparency and reproducibility of software engineering research. Third, OSF's project management tools and API integrations could streamline the research workflow for software engineering projects analyzed using Boa.

Finally, the integration could foster greater collaboration and sharing within the software engineering community, as researchers would have access to a robust set of tools for both data analysis (via Boa) and research management (via OSF). We are looking forward to promoting these new capabilities within the software engineering community.

The new Boa add-on for OSF sets the stage for more types of integrations that will support OSF communities with efficient workflows. Look for more OSF add-on developments this year.
