JISARA

Journal of Information Systems Applied Research and Analytics

Volume 18

V18 N4 Pages 46-55

Dec 2025


Training a large language model to code qualitative research data: Results from discussions of ethical issues


David Simmonds
Auburn University - Montgomery
Montgomery, AL USA

Russell Haines
Appalachian State University
Boone, NC USA

Abstract: Comment coding is an important part of qualitative research, but it is a labor-intensive process. Furthermore, researchers need to assess whether comments were coded accurately and reliably. Here, we present a process for arranging the original comments and using them to train a Google BERT large language model (LLM) that was able to code comments with 87.9% reliability. Future researchers could extend this process to code comments made in less-structured research settings, or to have the LLM create the comment groupings automatically.



Recommended Citation: Simmonds, D., Haines, R.P. (2025). Training a large language model to code qualitative research data: Results from discussions of ethical issues. Journal of Information Systems Applied Research and Analytics, 18(4), pp. 46-55. https://doi.org/10.62273/OTJZ7714