Anika Agarwal
Course: Computer Science 1, CMSC-201
Professor: Justin Frock
Assignment Title: A Polar Science Chatbot
Applications
A chatbot is an interface that mirrors human conversation using machine learning and large language models (LLMs).
The purpose of a chatbot is to provide instant answers. Because of their immersive and engaging responses, chatbots are optimal for convenience and efficiency (Benefits of chatbots).
Environmental Applications:
A chatbot is an effective way to convey research. This chatbot specializes in ice sheets, specifically the information that the layers of ice sheets hold about the Earth’s past geological, climate, and atmospheric conditions. This information is crucial in providing context about the Earth’s landscapes from millions of years ago and in predicting future patterns.
Methods
Data Collection & Training: Training artificial intelligence and machine learning models on more data leads to more accurate responses and less biased predictions (IBM, What is a pre-trained model). The amount of data used to train a model scales up with the model’s size and computational resources (Samborska). Training sets continue to grow “on research papers, books, videos, articles”.
Fine-tuning: A model is usually specialized for a domain (e.g., finance or the environment) through fine-tuning. This is useful because the LLM is tailored to a specific industry, so it understands jargon and nuance, providing “high-quality outputs that are more relevant and precise” (IBM, Domain). This chatbot is tailored to understanding ice sheets in the polar regions of Greenland and Antarctica.
Challenges and Successes
This chatbot has been developed across several platforms, each with its own problems. First, there were not enough CPU (central processing unit) and TPU (tensor processing unit) resources to run the chatbot.
Coming into this project, I had only taken basic Python classes and had little knowledge of how to program a chatbot from scratch. Nevertheless, I created a “code analysis” that breaks down each section of code and explains the details of each line.
Currently, I am refining the quality of answers by adjusting the fine-tuning parameters and creating thoughtful questions for my query-response set. The quality of the chatbot’s responses is not yet suitable for efficient public use, but with further refinement of the model, it will be applicable for wider use by individuals hoping to learn more about the climate crises in polar regions and their significance.
Results
This project uses a pre-trained model from Google’s Gemma line, Gemma 2B (Google DeepMind). The model is fine-tuned on a question set. The program asks the user to enter a prompt, and the chatbot then generates a response.
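The prompt-and-response cycle described above can be sketched as a short loop. This is a minimal illustration, not the project's actual code: the names run_chat and placeholder_generate are hypothetical, and placeholder_generate stands in for the call to the fine-tuned Gemma model, which is not shown here.

```python
def placeholder_generate(prompt):
    """Hypothetical stand-in for the fine-tuned model's text generation."""
    return f"[model response to: {prompt}]"


def run_chat(generate, read_input=input, write_output=print):
    """Read prompts until the user types 'quit', answering each one.

    Returns the number of prompts that were answered.
    """
    turns = 0
    while True:
        prompt = read_input("Ask about ice sheets (or type 'quit'): ").strip()
        if prompt.lower() == "quit":
            break
        if not prompt:
            # Skip empty input instead of sending it to the model.
            continue
        write_output(generate(prompt))
        turns += 1
    return turns
```

Calling run_chat(placeholder_generate) starts an interactive session in the terminal. Passing the input and output functions as arguments is a design choice that lets the same loop later be reused behind a front end.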
The results before fine-tuning are poorer in quality than the results after it. Instead of retraining all 2 billion parameters, the original model is frozen and small trainable layers are added.
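The idea of freezing the original weights and adding small trainable layers can be illustrated with a toy example. The source does not name the adapter method used, so this sketch assumes a low-rank adapter in the style of LoRA; the class AdaptedLayer, the helper matmul, and all sizes are hypothetical and chosen only to show how few parameters need training.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


class AdaptedLayer:
    """A frozen weight matrix W plus a small trainable update A @ B."""

    def __init__(self, d_in, d_out, rank, seed=0):
        rng = random.Random(seed)
        self.d_in, self.d_out, self.rank = d_in, d_out, rank
        # Frozen "pre-trained" weights (d_in x d_out); never updated.
        self.W = [[rng.gauss(0, 0.02) for _ in range(d_out)] for _ in range(d_in)]
        # Trainable adapter: A is d_in x rank, B is rank x d_out.
        # B starts at zero, so the adapter initially changes nothing.
        self.A = [[rng.gauss(0, 0.02) for _ in range(rank)] for _ in range(d_in)]
        self.B = [[0.0] * d_out for _ in range(rank)]

    def forward(self, x):
        """Compute x @ W + (x @ A) @ B for a batch x of shape batch x d_in."""
        base = matmul(x, self.W)
        update = matmul(matmul(x, self.A), self.B)
        return [[b + u for b, u in zip(b_row, u_row)]
                for b_row, u_row in zip(base, update)]

    def frozen_params(self):
        return self.d_in * self.d_out

    def trainable_params(self):
        return self.rank * (self.d_in + self.d_out)


layer = AdaptedLayer(d_in=64, d_out=64, rank=4)
print(layer.frozen_params(), layer.trainable_params())  # prints: 4096 512
```

In this toy layer only 512 of 4,608 values would be updated during training, which is why this kind of approach needs far less memory and compute than retraining the full model.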
Conclusions
Next steps for this project include adding targeted research papers to the training data to make the chatbot specific to the University of Maryland, Baltimore County’s NSF research project.
The chatbot will be well suited to answer questions about the research.
Then, the next step will be developing a front end for the chatbot. This component aims to make the chatbot more user-friendly.