Teaching LLM Assistants in Carpentries Workshops, part 1
Generative AI “assistants” such as ChatGPT and GitHub Copilot, based on large language models (LLMs), have gained a lot of attention over recent years and are impacting how people learn about and practice coding and data science. Some members of The Carpentries community have reported observing learners using ChatGPT and similar tools to assist them in and after workshops. Others have reported on efforts to educate members of their local communities about the models behind these tools, their advantages and limitations, and how to judiciously approach their use.
Simultaneously, initiatives such as ChatGPT in Computing Education have been discussing how such tools might impact the teaching and learning of the kinds of skills taught in Carpentries workshops. Others warn of the dangers of relying on these tools, and raise concerns about the ethical and societal implications of adopting or endorsing their use.
In November 2024, the Curriculum Team hosted two community discussions on the topic of Teaching LLM Assistants in Carpentries Workshops. Around 40 community members gathered across these sessions to discuss how they currently use generative AI assistants in their work, whether and how they teach with/about them already, the potential advantages and disadvantages of including them in Carpentries workshops, and what that might look like if we started to do so.
In this post, we summarise the main points of discussion and the outcomes of those community sessions. A second post will follow next week, describing the next steps that the Curriculum Team plans to take in relation to this topic.
Patterns of usage
Participants at the sessions described a wide range of usage: some had not made much use of LLM assistants but remained interested in the topic and keen observers of the direction of the field, while others had adopted ChatGPT or GitHub Copilot into their daily routine. Many participants described usage somewhere in between these two extremes.
Those who did make use of LLM assistants described a number of different tasks where they regularly found them helpful, including:
- Translating, whether of prose between different human languages (e.g. English -> Spanish), or of existing code into a different programming language.
- Generating “boilerplate” code.
- Rewriting/improving existing scripts to make them more efficient.
- Finding better terms to use as a query in a search engine.
- Explaining unfamiliar functions and syntax.
- Writing documentation strings (“docstrings”) for functions, or comparing an existing docstring with what the function actually does.
- Describing what a given regular expression will match.
Several of these examples could be relevant skills for learners at Carpentries workshops, as they take their first steps working with software and data.
Teaching the use of LLM assistants
In addition to using these tools personally, many participants in the discussion sessions were already teaching with/about LLMs, whether in Carpentries workshops or other classes where the same or similar skills are taught. Those already teaching the topic described their approach, which typically involved some demonstration of what the tools can do, the mistakes they can make, and some of the hazards of using them. A common theme among participants’ descriptions was a need for “myth-busting”, to expose and correct common misconceptions among learners about how these tools work, their limitations, and the differences between LLM assistants and other tools (e.g. search engines). Describing their motivation to teach LLM assistants, several participants observed that their learners are already making extensive use of the tools in their work, studies, and day-to-day lives. In that context, they felt that it is essential to acknowledge the existence of LLM assistants, discuss their potential uses and hazards, and try to ensure that learners have a good understanding of how to make safe and appropriate use of these tools.
Advantages and disadvantages of teaching LLM assistants in The Carpentries
Participants returned to that latter point when discussing the potential advantages and disadvantages of teaching LLM assistants in Data Carpentry, Library Carpentry, and Software Carpentry workshops: in many cases, learners will already be using these tools, or will be curious about them if not. Although many participants in the discussion sessions had strong reservations about these technologies, there was a general sense that we cannot omit them from workshops altogether. Some pointed to evidence and personal experience suggesting that learners who already have some working mental model of programming are able to use LLMs more effectively to augment their approach, while those who lack the skills, vocabulary, or confidence are less likely to be able to use the “assistance” offered by such tools. Rather than omitting the tools, and just as Carpentries workshops aim to instil good (enough) practices in software development and data management, the majority of session participants saw Carpentries workshops as an opportunity for Instructors to teach good practices and help learners develop a better, safer, more useful understanding of LLMs.
Furthermore, some participants observed that usage of LLM assistants is increasingly widespread in the workplace and will likely become an expected skill in the coming years. They felt that we would be doing learners a disservice by not teaching them the skills and knowledge to use these tools as part of their routine approach to software development and data analysis.
However, participants also identified several potential downsides to formally introducing content on LLM assistants into the curriculum for Carpentries workshops. First, LLM tools may be unavailable or forbidden at some institutions, for example where usage is restricted to a specific platform for data privacy or regulatory reasons. This could make it difficult to develop content and examples that are specific to a particular tool with confidence that they will be usable at workshops in all locations. Similarly, participants highlighted the potentially significant differences in output and performance between paid and free versions of the tools. That all the software we teach is free of cost is one of the major advantages of Carpentries workshops, ensuring they are accessible to all learners, but here we may find that free versions of LLM assistants are more limited in their potential usefulness to learners. Furthermore, several participants sounded a note of caution around the currently free versions of the most popular tools, citing the danger that their owners may introduce fees in the future that could restrict Instructors’ ability to teach lesson content about them in workshops.
Finally, participants acknowledged that the topic would need a significant investment of time dedicated to teaching and discussing it properly in a workshop – time that cannot then be used for something else. Most participants were Carpentries Instructors, familiar with the challenge of fitting the current lesson content into the time available at a two-day workshop. How would we make space to teach LLM assistants effectively in workshops that are already packed to bursting with other important content?
LLM assistants and The Carpentries Core Values
Central to the discussion in both community sessions was the question of how teaching LLMs in Carpentries workshops might align or conflict with the community’s core values. Participants highlighted concerns about the lack of attribution of the data used to train the models (conflicting with our commitments to Act openly and Value all contributions), and about the environmental impact of the significant energy and resource requirements of the training process itself.
On the other hand, adapting The Carpentries lessons to teach about LLMs and how to use them effectively would align with our commitment to Empower one another and our eagerness to be Always learning.
Finally, the aforementioned potential difficulties with accessing LLM assistants in different institutions, and concerns about the cost of using these tools in the future, were identified as potentially conflicting with the commitment to championing Access for all.
Support for teaching LLM assistants in Carpentries workshops
While acknowledging these concerns, the predominant view among participants at the community sessions was that The Carpentries should be teaching LLM assistants in workshops, alongside the fundamental concepts and good practices that we already cover. Learners need a working mental model of the domain to be able to interpret answers and debug code suggestions provided by an LLM, and need to be informed about what LLMs are and are not good at, the kinds of tasks they should and should not be used for, and the common difficulties they may face when using them. Support among participants was divided evenly between the two suggested options: teaching these tools only as a section within existing lessons, and teaching them in more detail in a dedicated lesson/workshop.
Conclusion and trailer for part 2
These were excellent community discussions: another reminder that The Carpentries community has so much valuable experience and expertise, a willingness to share it, and a desire to learn from one another. We are so grateful to all the community members who joined the sessions and participated in the discussions – thank you! We look forward to continuing the conversation in 2025.
In part 2, we will summarise how the Curriculum Team plans to respond to the points raised in these two community discussions. One element of that response will be follow-up discussions, around specific topics within the theme of AI and The Carpentries, in the first few months of the year (follow links for event times in your local time zone):
- Tuesday 28 January, 12:00 UTC and 21:00 UTC: LLMs for Data Science: The Ethics of Teaching LLMs in Carpentries Workshops
- Tuesday 25 February, 12:00 UTC and 21:00 UTC: LLMs for Data Science: Essential Knowledge and Common Misconceptions
- Tuesday 25 March, 12:00 UTC and 21:00 UTC: LLMs for Data Science: Case Studies to Inform Carpentries Curriculum
Sign up to join these discussions on the community sessions Etherpad.