Security patch documentation is a critical yet time-consuming aspect of secure software development. This thesis investigates the use of generative AI models to automate the generation of SECOM-compliant commit messages directly from code diffs. Two prompting strategies are evaluated: a zero-shot baseline, where messages are generated from raw Git diffs without examples, and a few-shot approach, where structured exemplars guide the model via in-context learning.
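To make the contrast between the two strategies concrete, the minimal Python sketch below shows how each prompt might be assembled from a Git diff; the prompt wording, SECOM template fields, and exemplar format shown here are illustrative assumptions rather than the thesis's exact setup.

    # Sketch of the two prompting strategies compared in this work.
    # Prompt wording and template fields are assumptions for illustration.

    SECOM_TEMPLATE = """\
    <header>: concise one-line summary of the security fix

    <body>: what the vulnerability is, why it matters, how it is fixed

    Weakness: <CWE ID>
    Severity: <low | medium | high | critical>
    CVSS: <score>
    """

    def zero_shot_prompt(diff: str) -> str:
        """Raw diff only: the model must infer SECOM structure unaided."""
        return (
            "Write a SECOM-compliant commit message for this security patch.\n\n"
            f"{diff}"
        )

    def few_shot_prompt(diff: str, exemplars: list[tuple[str, str]]) -> str:
        """Prepend (diff, SECOM message) pairs for in-context learning."""
        shots = "\n\n".join(
            f"Diff:\n{d}\n\nCommit message:\n{m}" for d, m in exemplars
        )
        return (
            f"Follow this SECOM template:\n{SECOM_TEMPLATE}\n"
            f"Examples:\n{shots}\n\n"
            f"Diff:\n{diff}\n\nCommit message:"
        )

In the zero-shot case the model sees only the raw diff, whereas the few-shot prompt both pins the output format and grounds the metadata fields in worked examples.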
In a large-scale benchmark of 500 real-world security patches, zero-shot prompting achieved a SECOMLint structural compliance rate of 33%, surpassing the 6.8% compliance rate of original human-written messages, a relative improvement of +383%. However, metadata accuracy remained low, highlighting the limitations of unguided generation. A targeted evaluation of few-shot prompting across 215 entries drawn from five CWE categories revealed substantial relative gains: format compliance increased by 11.84%, metadata accuracy by 34.08%, and CVSS exact-match rate by 177.57%, while severity estimation error (MAE) decreased by 15.72%.
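Here, severity estimation error is reported as a mean absolute error over predicted versus ground-truth severity values; under the standard definition (the thesis's exact operationalization may differ), MAE = (1/n) Σᵢ |ŝᵢ − sᵢ|, where ŝᵢ is the model-predicted severity for patch i and sᵢ its ground-truth value, so lower is better.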
The results demonstrate that few-shot prompting significantly enhances both structural and semantic fidelity, especially for well-represented vulnerability classes. However, performance remains sensitive to data scarcity and semantic complexity, as evidenced by the underperformance on CWE-918 (Server-Side Request Forgery). In addition to contributing a replicable prompting framework and validation pipeline, this work introduces a vision for SECOMLint as a model-assisted documentation tool. The findings highlight both the promise and the boundaries of using large language models to support structured security documentation, and they lay the foundation for future tooling that reduces the manual burden of writing security-aware commit messages.