Using GitHub to manage assignments in a programming course benefits both learners and instructors, but getting started is not intuitive
Image courtesy of Author
Version control tools such as Git and GitHub are tools of the trade for industry software development and data science teams. They are not particularly difficult to use, but there is a learning curve to getting up to speed with them, particularly when they are used from the command line, as is typical for professional software developers. That learning curve can be intimidating for junior developers entering their first industry roles who have not previously worked with these tools.
Despite its widespread use in industry, GitHub has only recently seen increasing use in academic class settings. A significant driver of the increasing adoption in universities is the education-specific functionality contained in GitHub Classroom (https://classroom.github.com/), which was released in 2015. GitHub Classroom enables instructors to manage the distribution and collection of student assignments via GitHub, rather than through a university’s traditional Learning Management System (LMS). Although GitHub does not offer the full functionality of an LMS, it does provide several advantages over one, most notably the opportunity for students to gain comfort in working with the tool before entering their first professional industry role. GitHub claims that as of August 2019 roughly 20,000 teachers were using GitHub Classroom to manage assignments.
In Duke University’s Master of Engineering in AI for Product Innovation program we began utilizing GitHub Classroom this fall in our graduate-level courses. Our first pilot with GitHub Classroom was in our Sourcing Data for Analytics course, a graduate-level course focused on sourcing, managing, cleaning and analyzing real-world data. Much of the course, and the majority of the homework assignments the students completed, was programming-intensive with Python being the primary language used. The programming assignments were in the form of Jupyter Notebooks, with skeleton code generally provided to students to get them started on each assignment.
GitHub Classroom was used to distribute homework assignments to students, collect assignments automatically at each deadline, and then return grades and feedback to students. The instructor creates an assignment from a Jupyter Notebook as a template repository in GitHub. GitHub Classroom then distributes it by automatically setting up a private repository for each student containing the assignment’s skeleton code and instructions. Students can then clone the assignment files to their computer, work on them, and push updated versions back to their repository as they go.
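The student side of this workflow comes down to a handful of Git commands. As a minimal sketch, the helpers below list those commands in order and can optionally run them from Python, for instance from a notebook cell. The function names, the `main` branch name and the dry-run behavior are illustrative assumptions, not part of GitHub Classroom itself.

```python
import shlex
import subprocess
from pathlib import Path


def clone_command(repo_url: str) -> list[str]:
    """Command a student runs once to copy the assignment repository locally."""
    return ["git", "clone", repo_url]


def submit_commands(message: str, branch: str = "main") -> list[list[str]]:
    """Commands a student runs each time they save progress to GitHub."""
    return [
        ["git", "add", "--all"],             # stage every changed file
        ["git", "commit", "-m", message],    # record a new version locally
        ["git", "push", "origin", branch],   # upload it to the student's repository
    ]


def run_all(commands: list[list[str]], cwd: Path, dry_run: bool = True) -> list[str]:
    """Return the commands as shell-style text; execute them only if dry_run is False."""
    lines = []
    for cmd in commands:
        lines.append(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, cwd=cwd, check=True)
    return lines
```

With the default `dry_run=True` nothing is executed, so `run_all(submit_commands("wip"), Path("."))` simply returns the three commands as text. Passing `dry_run=False` runs them inside the cloned assignment folder.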
We used nbgrader for assignment grading, combining autograding with manual grading. To automate the collection of students’ work on each assignment, we used a helpful package called abc-classroom. Developed and maintained by University of Colorado’s EarthLab, abc-classroom clones each student’s GitHub repository for an assignment to the instructor’s computer, at which point nbgrader can be used to grade the work. The instructor then pushes the graded assignments back out to the students’ repositories automatically via abc-classroom.
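To make the collection step concrete, here is a rough sketch of what "cloning every student repository" amounts to. This is not abc-classroom's code or API. It assumes that student repositories live in the course's GitHub organization and are named `<assignment>-<username>`, and the organization name, function names and destination folder are hypothetical.

```python
import subprocess
from pathlib import Path


def student_repo_url(org: str, assignment: str, username: str) -> str:
    # Assumed naming convention: <assignment>-<username> inside the course organization.
    return f"https://github.com/{org}/{assignment}-{username}.git"


def collect_assignment(
    org: str,
    assignment: str,
    usernames: list[str],
    dest: Path,
    dry_run: bool = True,
) -> dict[str, Path]:
    """Clone each student's repository into dest/<username> and return the target paths."""
    targets = {}
    for username in usernames:
        target = dest / username
        targets[username] = target
        if not dry_run:
            dest.mkdir(parents=True, exist_ok=True)
            url = student_repo_url(org, assignment, username)
            subprocess.run(["git", "clone", url, str(target)], check=True)
    return targets
```

Because the student repositories are private, actually running the clones (`dry_run=False`) requires Git credentials with access to the course organization.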
Benefits for students
Let’s start with the benefits we have found for students from using GitHub Classroom in combination with the above-mentioned tools. The first and most significant benefit was that students were able to gain practice with using version control during their studies, significantly reducing the learning curve for Git / GitHub after graduation when they begin work on an industry team. In our end-of-semester student survey, 82% of students indicated that the use of GitHub in the course was “very valuable”, and the remaining 18% indicated it was “somewhat valuable”. Not a single student felt that the use of GitHub in the course was “not valuable”.
Additionally, the use of Git/GitHub for version control enabled students to easily revert to older versions of their saved work as needed. Although we did not require its use for the team-based course project, all student teams decided to use GitHub to manage their shared codebase and collaborative development. Finally, the use of GitHub to manage assignments ensured that each student left the course with not only a level of comfort in working with Git/GitHub but also their own GitHub portfolio of work which they could build on in the future.
Benefits for instructors
The use of GitHub had benefits for the course instructors as well. The key benefits we observed over the semester, relative to the default option of using the university LMS for assignment management, were:
- Ease of collecting assignments and automatically organizing them for grading in nbgrader. Thanks to the abc-classroom package, with a couple of commands on the command line we were able to download all assignments, organize them for grading and autograde them with nbgrader. Our university LMS also allows all student assignments to be downloaded in a single action, but the files must then be organized by hand into the folder structure that nbgrader needs.
- Simplified distribution of graded assignments with feedback. One major challenge of our LMS (Sakai) is that each graded assignment must be uploaded separately for each student in the course. With larger numbers of students this becomes very time-consuming and tedious. Using GitHub Classroom and the abc-classroom package, with a single command we were able to release graded assignments (including any written feedback in the Jupyter Notebook) back to the students. They could then view the graded version in their assignment repo.
- Easier interaction between students and instructors around their work. If students got stuck or had questions on particular parts of their code, it was easy for the instructor to navigate to the current version of the code in their GitHub repository and review it to provide feedback. Previously, students would have to email their code files to the instructors (or share screenshots), resulting in difficult-to-follow email threads. The simpler, more frequent interaction benefitted both students and instructors.
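To illustrate the manual organizing work mentioned in the first point above, here is a minimal sketch that sorts a flat folder of downloaded notebooks into nbgrader's default `submitted/<student_id>/<assignment_id>/` layout. The `<student_id>__<notebook>.ipynb` file naming is a hypothetical convention chosen for this example (an LMS export may name files differently), and the function name is our own.

```python
import shutil
from pathlib import Path


def organize_for_nbgrader(download_dir: Path, course_dir: Path, assignment_id: str) -> list[Path]:
    """Copy flat notebook files into submitted/<student_id>/<assignment_id>/."""
    placed = []
    for notebook in sorted(download_dir.glob("*.ipynb")):
        # Assumed file naming: <student_id>__<notebook name>.ipynb
        student_id, separator, notebook_name = notebook.name.partition("__")
        if not separator:
            continue  # skip files that do not follow the assumed pattern
        target_dir = course_dir / "submitted" / student_id / assignment_id
        target_dir.mkdir(parents=True, exist_ok=True)
        target = target_dir / notebook_name
        shutil.copy2(notebook, target)
        placed.append(target)
    return placed
```

Once the files are in place, running `nbgrader autograde <assignment_id>` from the course directory should pick them up, assuming the course uses nbgrader's default directory structure.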
Getting started with GitHub for assignment management was not without its challenges. The first and perhaps most significant challenge was that there was no good comprehensive resource for instructors with step-by-step instructions for assignment management using GitHub Classroom and nbgrader. GitHub provides some basic video tutorials on using GitHub Classroom, but they are limited to the functionality of GitHub itself and do not cover the other steps of the process which take place outside of GitHub’s walls (e.g. automated assignment collection, grading, and returning of assignments). To help future faculty members avoid our own struggles with getting started, we have documented our process in a set of step-by-step instructions for creating, distributing, collecting, grading and returning assignments using GitHub Classroom / abc-classroom / nbgrader.
In addition to the learning curve for instructors, we also had to ensure we efficiently got the students through the learning curve of Git/GitHub prior to using it for the first assignment. To do this, we spent time in the first class session doing a recorded walk-through of how to navigate Git/GitHub, and followed that up with a written step-by-step instruction sheet covering the main functions they would need to perform in order to collect and submit assignments.
Another challenge we faced was managing the large datasets which accompany assignments. Small data files could be included in the template repository together with the notebook assignment file and handled directly through GitHub Classroom. However, many of our assignments required students to work with large real-world datasets which exceeded GitHub’s file size limits. As a work-around, we hosted these files on a cloud server and included a few lines of code to download and extract them in the skeleton notebook assignment files we provided to students. We then had to make sure we included a .gitignore file in the template repository so that students did not try to push the large data files up to GitHub, since such pushes failed with errors.
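The download-and-extract lines in a skeleton notebook might look roughly like the sketch below. It assumes the dataset is hosted as a zip archive; the URL, the `data` folder name and the function name are placeholders for illustration rather than the ones used in the course.

```python
import zipfile
from pathlib import Path
from urllib.request import urlretrieve

# Hypothetical location of the hosted dataset; replace with the real URL.
DATA_URL = "https://example.com/course-data/assignment-data.zip"
DATA_DIR = Path("data")


def fetch_dataset(url: str, data_dir: Path) -> Path:
    """Download a zip archive into data_dir (if not already there) and extract it."""
    data_dir.mkdir(parents=True, exist_ok=True)
    archive = data_dir / "dataset.zip"
    if not archive.exists():  # avoid re-downloading a large file on every run
        urlretrieve(url, str(archive))
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(data_dir)
    return data_dir


# In the notebook, students would run:
# fetch_dataset(DATA_URL, DATA_DIR)
```

A matching `.gitignore` entry in the template repository, such as a line containing `data/`, keeps the downloaded archive and the extracted files out of students' commits.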
Despite the initial challenges in working out an effective combination of tools and process, our pilot of GitHub Classroom for assignment management was quite successful. Every student in the course noted on our semester-end survey that they found its use valuable in building their skills. Although our students came from various branches of engineering and the vast majority did not have strong programming backgrounds, they found it straightforward to use, and we did not have a single instance of a student failing to submit their work because of difficulty using GitHub. Now that we have assembled a streamlined process, we intend to expand the use of GitHub Classroom to other courses in our graduate program.