MSR4SBOM

Mining Software Repositories for Enhanced Software Bills of Materials


MSR4SBOM is a project that aims to deliver a framework that analyzes the content of software repositories and Software Bills of Materials (SBOMs) to provide context-sensitive recommendations.


MOTIVATIONS


A Software Bill of Materials (SBOM) describes, in a structured and machine-readable format, the open-source and proprietary components that constitute a software product, including their licenses, versions, vendors, vulnerabilities, and dependency relationships. SBOMs enable practitioners to gain visibility into the software supply chain and monitor any risks associated with software security, licensing, and more.

Public administrations have been promoting SBOMs as a means to achieve secure and accountable software products. For example, in 2021, the United States Government, through President Biden’s Executive Order 14028, required any organization supplying software products to federal agencies to provide SBOMs for those products. Moreover, in 2022, the European Commission proposed a cybersecurity regulation, the Cyber Resilience Act, requiring software producers to document the vulnerabilities and components of their software products with SBOMs. As a result, SBOMs are expected to soon become the de facto standard for any organization that develops or maintains software products in both industrial and open-source contexts.

{
  "$schema": "http://cyclonedx.org/schema/bom-1.6.schema.json",
  "bomFormat": "CycloneDX",
  "specVersion": "1.6",
  "serialNumber": "urn:uuid:3e671687-395b-41f5-a30f-a58921a69b79",
  "version": 1,
  "components": [
    {
      "type": "library",
      "group": "org.example",
      "name": "mylibrary",
      "version": "1.0.0",
      "cpe": "cpe:/a:example:mylibrary:1.0.0",
      "purl": "pkg:maven/org.example/mylibrary@1.0.0",
      "externalReferences": [
        {
          "type": "advisories",
          "url": "https://example.org/security/advisories.json"
        }
      ]
    }
  ]
}
                

Example of a component (specifically, a library) in an SBOM, expressed in JSON according to the CycloneDX standard (source).
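To make the structure of the example above more concrete, the following minimal Python sketch reads a CycloneDX SBOM in JSON and summarizes each component by name, version, package URL, and advisory links. It is only an illustration and not part of the MSR4SBOM framework: the function name `summarize_components` and the constant `SBOM_JSON` are hypothetical, and the embedded SBOM is a trimmed copy of the example above.

```python
import json

# Trimmed inline copy of the example SBOM shown above (CycloneDX 1.6, JSON).
# SBOM_JSON is a hypothetical name used only for this illustration.
SBOM_JSON = """
{
  "bomFormat": "CycloneDX",
  "specVersion": "1.6",
  "version": 1,
  "components": [
    {
      "type": "library",
      "group": "org.example",
      "name": "mylibrary",
      "version": "1.0.0",
      "purl": "pkg:maven/org.example/mylibrary@1.0.0",
      "externalReferences": [
        {"type": "advisories",
         "url": "https://example.org/security/advisories.json"}
      ]
    }
  ]
}
"""


def summarize_components(sbom: dict) -> list[dict]:
    """Return one summary record per component listed in a CycloneDX SBOM.

    Hypothetical helper: it reads only the fields used in the example
    (name, version, purl, externalReferences) and tolerates missing keys.
    """
    summaries = []
    for component in sbom.get("components", []):
        # Keep only the external references that point to security advisories.
        advisory_urls = [
            ref["url"]
            for ref in component.get("externalReferences", [])
            if ref.get("type") == "advisories" and "url" in ref
        ]
        summaries.append({
            "name": component.get("name"),
            "version": component.get("version"),
            "purl": component.get("purl"),
            "advisories": advisory_urls,
        })
    return summaries


sbom = json.loads(SBOM_JSON)
for item in summarize_components(sbom):
    print(item["purl"], item["advisories"])
```

Real-world SBOMs may also contain nested components and a separate dependency section, which this sketch deliberately ignores.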

CHALLENGES


MSR4SBOM aims to leverage the team's long-standing experience in diverse areas of Software Engineering to develop innovative solutions for creating and validating enhanced SBOMs. Specifically, MSR4SBOM will address the following challenges:

Granularity problem

Dealing with software supply chains requires handling and integrating information and dependencies from components at different levels of granularity (e.g., libraries or code snippets).

Dependency complexity

The set of dependencies that SBOMs need to consider is complex and originates from different sources (e.g., software/hardware configurations, static/dynamic linking, services, or files).

Lack of processes for SBOMs

The high heterogeneity of components requires customized processes to produce and consume SBOMs. The limited reusability of SBOMs could hinder the management of licensing and security concerns.

OBJECTIVES


MSR4SBOM aims to address the following four objectives:

State-of-the-practice

Understanding how extensively SBOMs are used by industrial and open-source practitioners, along with the associated challenges and needs.

Enhanced SBOMs

Developing approaches and tools to create enhanced and fine-grained SBOMs, with detailed information on licensing/security concerns.

SBOM infrastructures

Developing recommender systems to notify developers when events related to the components of enhanced SBOMs occur.

Assessment

Conducting in-field empirical evaluations of the proposed approaches and tools to promote their adoption by industrial and open-source practitioners.

TEAM



PUBLICATIONS


  1. Sabato Nocera, Massimiliano Di Penta, Simone Romano, Fatima Ahmed, and Giuseppe Scanniello «What We Know about AIBOMs: Results from a Multivocal Literature Review on Artificial Intelligence Bill of Materials» ACM Transactions on Software Engineering and Methodology, 2026. https://doi.org/10.1145/3786773

  2. Sabato Nocera, Sira Vegas, Giuseppe Scanniello, Massimiliano Di Penta, and Natalia Juristo «Causal or Correlational? A Cohort Study on the Effects of Code Smells on Class Change- and Fault-Proneness» Proceedings of the International Conference on Software Engineering, 2026.

  3. Sabato Nocera, Simone Romano, Massimiliano Di Penta, Rita Francese, and Giuseppe Scanniello «On the adoption of software bill of materials in open-source software projects» Journal of Systems and Software, 230, 112540, 2025. https://doi.org/10.1016/j.jss.2025.112540

  4. Daniele Bifolco, Simone Romano, Sabato Nocera, Rita Francese, Giuseppe Scanniello, and Massimiliano Di Penta «An empirical study on the accuracy of GitHub's dependency graph and the nature of its inaccuracy» Information and Software Technology, 187, 107854, 2025. https://doi.org/10.1016/j.infsof.2025.107854

  5. Pietro Cassieri, Simone Romano, and Giuseppe Scanniello «A Mining-Software-Repository study on deprecated API usages in open-source Java software applications» Information and Software Technology, 186, 107782, 2025. https://doi.org/10.1016/j.infsof.2025.107782

  6. Sabato Nocera, Davide Fucci, and Giuseppe Scanniello «Dealing with SonarQube Cloud: Initial Results from a Mining Software Repository Study» Proceedings of the International Symposium on Empirical Software Engineering and Measurement (ESEM), 2025, pp. 372–378.

  7. Sabato Nocera, Sira Vegas, Giuseppe Scanniello, and Natalia Juristo «Causal Inference Needs More Than Analysis: The Role of Study Design» Proceedings of the SIGSOFT FSE Companion, 2025, pp. 1428–1431. https://doi.org/10.1145/3696630.3731619

  8. Davide Fucci, Massimiliano Di Penta, Simone Romano, and Giuseppe Scanniello «Augmenting Software Bills of Materials with Software Vulnerability Description: A Preliminary Study on GitHub» Proceedings of the SIGSOFT FSE Companion, 2025, pp. 631–635. https://doi.org/10.1145/3696630.3728513

  9. Riccardo D'Avino, Sabato Nocera, Daniele Bifolco, Federica Pepe, Massimiliano Di Penta, and Giuseppe Scanniello «ALOHA: A(IBoM) tooL generatOr for Hugging fAce» Proceedings of the International Conference on Evaluation and Assessment in Software Engineering (EASE), 2025, pp. 929–937. https://doi.org/10.1145/3756681.3756998

  10. Marcel Böhme, Eric Bodden, Tevfik Bultan, Cristian Cadar, Yang Liu, and Giuseppe Scanniello «Software Security Analysis in 2030 and Beyond: A Research Roadmap» ACM Transactions on Software Engineering and Methodology, 34(5), Article 144, 2025. https://doi.org/10.1145/3708533

  11. Daniele Bifolco, Pietro Cassieri, Giuseppe Scanniello, Massimiliano Di Penta, and Fiorella Zampetti «Do LLMs Provide Links to Code Similar to What They Generate? A Study with Gemini and Bing CoPilot» Proceedings of the Mining Software Repositories Conference (MSR), 2025, pp. 223–235. https://doi.org/10.1109/MSR66628.2025.00042

  12. Sabato Nocera, Sira Vegas, Giuseppe Scanniello, and Natalia Juristo «Software Composition Analysis and Supply Chain Security in Apache Projects: an Empirical Study» Proceedings of the Mining Software Repositories Conference (MSR), 2025, pp. 103–115. https://doi.org/10.1109/MSR66628.2025.00027

  13. Daniele Bifolco, Sabato Nocera, Simone Romano, Massimiliano Di Penta, Rita Francese, and Giuseppe Scanniello «On the Accuracy of GitHub's Dependency Graph» Proceedings of the International Conference on Evaluation and Assessment in Software Engineering (EASE), 2024, pp. 242–251. https://doi.org/10.1145/3661167.3661175

  14. Simone Romano, Giovanni Toriello, Pietro Cassieri, Rita Francese, and Giuseppe Scanniello «A Folklore Confirmation on the Removal of Dead Code» Proceedings of the International Conference on Evaluation and Assessment in Software Engineering (EASE), 2024, pp. 333–338. https://doi.org/10.1145/3661167.3661188

  15. Giuseppe Scanniello, Massimiliano Di Penta, Simone Romano, Rita Francese, Sabato Nocera, Pietro Cassieri, Daniele Bifolco, and Fiorella Zampetti «MSR4SBOM: Mining Software Repositories for enhanced Software Bills of Materials» Proceedings of the International Symposium on Empirical Software Engineering and Measurement (ESEM), ACM, 2024, pp. 589–593. https://doi.org/10.1145/3674805.3695390

  16. Sabato Nocera, Massimiliano Di Penta, Rita Francese, Simone Romano, and Giuseppe Scanniello «If it's not SBOM, then what? How Italian Practitioners Manage the Software Supply Chain» Proceedings of the International Conference on Software Maintenance and Evolution (ICSME), 2024, pp. 730–740. https://doi.org/10.1109/ICSME58944.2024.00077

  17. Pietro Cassieri, Simone Romano, and Giuseppe Scanniello «On Deprecated API Usages: An Exploratory Study of Top-Starred Projects on GitHub» Proceedings of the International Conference on Product-Focused Software Process Improvement (PROFES), 2023, pp. 415–431. https://doi.org/10.1007/978-3-031-49266-2_29