Two students in computer science and software engineering recognized for undergraduate research

Published: Feb 20, 2024 9:00 AM

By Joe McAdory

Two students in computer science and software engineering, Matthew Freestone and Hugh Williams, were recently recognized by the Computing Research Association (CRA) as two of its Outstanding Undergraduate Researchers for 2023-24.

Their deep dive into Artificial Intelligence (AI) and Natural Language Processing (NLP) research was selected as an honorable mention by the CRA, which recognizes undergraduate students in North American colleges and universities who show outstanding research potential in an area of computing research.

“I am thrilled to see two of the most exceptional undergraduate students I’ve had the privilege of collaborating with recognized for their work on the national level,” said Shubra (Santu) Karmaker, assistant professor in computer science and software engineering and director of the Big Data Intelligence (BDI) Lab, where he mentored Freestone and Williams.

“Matthew’s principal research interest revolves around computational linguistics, focusing on the distributed semantics and the application of transformer-based encoding techniques. In the past year, Matthew has come up with a set of original research questions related to Large Language Models’ (LLMs) semantic properties, proposed a concrete plan to test a lot of hypotheses to answer these questions, and successfully executed it.”

More specifically, Freestone focused on the differences between classic word embeddings and embeddings derived from large language models. His primary objectives were to compare the two families systematically and to examine their vector representations through quantitative and qualitative analysis. Embedding models, pre-trained on extensive data, create high-dimensional vectors that capture the semantic meaning of words, which helps with tasks such as predicting the next word in a sentence or clustering semantically similar words.
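For readers curious what "capturing semantic meaning" in a vector looks like in practice, here is a minimal sketch. It is not code from the BDI Lab: the three-dimensional vectors and the helper names `cosine_similarity` and `nearest_neighbor` are invented for illustration, and real embeddings have hundreds or thousands of dimensions.

```python
import math

# Toy 3-dimensional vectors, invented for illustration only.
# Real embeddings have hundreds or thousands of dimensions.
TOY_EMBEDDINGS = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.2, 0.1],
    "car": [0.1, 0.9, 0.3],
    "truck": [0.0, 0.8, 0.4],
}


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest_neighbor(word, embeddings):
    """Return the other word whose vector points most nearly the same way."""
    return max(
        (other for other in embeddings if other != word),
        key=lambda other: cosine_similarity(embeddings[word], embeddings[other]),
    )


print(nearest_neighbor("cat", TOY_EMBEDDINGS))  # dog
print(nearest_neighbor("car", TOY_EMBEDDINGS))  # truck
```

In this toy setup, words with related meanings sit close together, which is the property that makes clustering similar words possible.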

However, these numerical representations, especially in newer models with more dimensions, are challenging for humans to interpret, contributing to the 'black box' nature of neural language models. Because embeddings are central to many language modeling tasks, understanding how they behave and uncovering any inherent biases is crucial. Freestone found that PaLM and ADA, two LLM-based models, tend to agree with each other and yield the highest performance on word analogy tasks, and that they also agree to a surprising and meaningful degree with SBERT, a smaller language model. This suggests that while PaLM and ADA can capture meaningful semantics and yield high accuracy, SBERT can be an efficient alternative when resources are constrained.
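One common way to ask whether two embedding models "agree", even when their vectors have different dimensions, is to compare how each model rates the similarity of the same word pairs. The sketch below illustrates that general idea only; it is an assumption about one possible approach, not the method used in Freestone's study. `MODEL_A`, `MODEL_B` and all vector values are made up.

```python
import math
from itertools import combinations

# Two invented "models" that embed the same words in different dimensions.
MODEL_A = {  # 2 dimensions
    "cat": [1.0, 0.1], "dog": [0.9, 0.2],
    "car": [0.1, 1.0], "truck": [0.2, 0.9],
}
MODEL_B = {  # 3 dimensions
    "cat": [0.9, 0.1, 0.0], "dog": [0.8, 0.2, 0.1],
    "car": [0.1, 0.9, 0.3], "truck": [0.0, 0.8, 0.4],
}


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def pairwise_similarities(embeddings):
    """Similarity of every word pair, in a fixed (alphabetical) pair order."""
    words = sorted(embeddings)
    return [cosine(embeddings[a], embeddings[b]) for a, b in combinations(words, 2)]


def pearson(xs, ys):
    """Pearson correlation between two equally long lists of numbers."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# A value near 1.0 means the two models rank word pairs similarly.
agreement = pearson(pairwise_similarities(MODEL_A), pairwise_similarities(MODEL_B))
print(round(agreement, 3))
```

Because only the similarity scores are compared, never the raw vectors, the two models do not need to share a dimension.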

“The challenge lies in comparing embeddings of varying dimensions and the computational expense due to the high-dimensional, large vocabulary nature of these models,” said Freestone, a senior and president of Auburn's Association for Computing Machinery chapter. “With Dr. Karmaker’s guidance, I have submitted a draft of the publication to a computational linguistics anthology. I look forward to working with Santu to address reviewer concerns on that work, and I’m excited to continue doing undergraduate research in the BDI Lab.”

Karmaker described Williams as “one of the strongest and most sincere undergraduate students that I have ever worked with and is definitely one of the best students in the CSSE undergraduate program in the College of Engineering.”

“I am hugely impressed by how fast Hugh has progressed in NLP research. He has a great aptitude for quickly learning research paper concepts and thinking critically. Hugh is always curious and courageous in exploring new ideas, which will help him become a great researcher in the future,” Karmaker said.

Williams’ research focuses on generative AI evaluation, specifically Natural Language Generation (NLG) evaluation. The common practice for evaluating NLG systems is to compute the similarity between a collection of automatically generated documents and their corresponding human-written, gold-standard reference documents. Unfortunately, existing document similarity metrics are black boxes and thus hard to interpret and explain, which makes robust evaluation of NLG tasks even more challenging.
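To make that common practice concrete, the sketch below scores generated texts against references by word overlap and averages the result over a collection. It is a deliberately simplified stand-in in the spirit of overlap metrics such as ROUGE, not the official ROUGE implementation. The function names and example sentences are invented.

```python
from collections import Counter


def unigram_f1(generated, reference):
    """Word-overlap F1 between one generated text and its reference."""
    gen_tokens = generated.lower().split()
    ref_tokens = reference.lower().split()
    # Counter intersection keeps each shared word at its minimum count.
    overlap = sum((Counter(gen_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(gen_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def corpus_score(pairs):
    """Average the per-document score over (generated, reference) pairs."""
    return sum(unigram_f1(g, r) for g, r in pairs) / len(pairs)


pairs = [
    ("the cat sat on the mat", "the cat sat on the mat"),
    ("a dog barked loudly", "the dog barked"),
]
print(round(corpus_score(pairs), 3))
```

The output is a single number per document pair, which shows the limitation described above: the score says how similar two texts are, but not why.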

To address this issue, Williams introduced a new evaluation metric called ExSiM, which provides a vector of scores instead of a single similarity score. Each component of the vector describes a particular property of the similarity metric, which provides a natural means of explanation. His experimental results demonstrated that the proposed metric can perform comparably to traditional metrics like BERTScore and ROUGE for undirected similarity assessment while providing useful explanations. In addition, ExSiM yields a higher human-machine agreement for directed similarity assessment. This work has also been submitted to ACL 2024.
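The article does not spell out ExSiM's components, so the sketch below is not ExSiM. It only illustrates the general idea of returning several named scores instead of one, including "directed" scores that depend on which text is compared against which. The component names and the word-overlap approach are assumptions made for this example.

```python
def similarity_vector(text_a, text_b):
    """Return several named similarity scores (hypothetical components).

    Assumes both texts contain at least one word.
    """
    words_a = set(text_a.lower().split())
    words_b = set(text_b.lower().split())
    shared = words_a & words_b
    return {
        # Directed: how much of A's vocabulary also appears in B.
        "a_covered_by_b": len(shared) / len(words_a),
        # Directed: how much of B's vocabulary also appears in A.
        "b_covered_by_a": len(shared) / len(words_b),
        # Undirected: shared words over all distinct words.
        "jaccard": len(shared) / len(words_a | words_b),
    }


scores = similarity_vector(
    "the storm closed schools",
    "the storm closed schools and roads on monday",
)
print(scores)
```

Here the two directed scores differ, which tells a reader that the first text is fully contained in the second but not the reverse, a distinction a single number would hide.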

“I investigated how these situations are best dealt with, and then determined the overall similarities between documents and texts,” said Williams, a junior from Birmingham. “We had to find the semantic overlap between different documents, or texts. In summary, my interest is trying to determine better ways to make AI more explainable.”

Why tackle research at the undergraduate level? Curiosity, Williams said.

“I find it helpful on a personal level, but then of course it can also be very useful career-wise,” Williams said. “But really, much of it is curiosity. I'm interested in natural language processing using AI to do English tasks, the tasks that feel the most human out of almost any given task. It feels weird for machines to do it, but I have an interest in language, too. It’s very interesting how we communicate.”

Media Contact: Joe McAdory, 334.844.3447
Hugh Williams, left, and Matthew Freestone research natural language processing.
