This page describes a variety of small(ish) computer science projects that would be useful for me if someone were to complete them. You can consider them a challenge or a push in a direction to try to learn something new.

If you are an undergraduate at GW, then I encourage you to pick one that interests you and try to solve it. Doing so will win you fame and glory... your name on this website! Sadly these bounties do not come with a monetary or "extra credit" reward (so you should probably do your homework first if you are currently in my class).

How this works

Below you will find short descriptions of code which would be useful to me. They are marked with a difficulty (could be totally off) and suggested language (generally this is flexible).

If one of the projects interests you, I suggest you create an Issue in the Bounties GitHub repository to say you are interested in solving it. If you have any questions about what is expected, post them there (i.e., don't email me). Learn more about GitHub issues here.

To begin working on the project you should make a fork of my Bounties repository. This will create a copy of it within your GitHub account. You can then clone your fork to work on the code locally. My suggestion is to do all of your work inside a branch in your repository, keeping the master branch in sync with mine. You may need to learn more about git forks, clones, and branches. Feel free to work together with other students.

Once you have completed your coding, you should create a Pull Request to notify me that you would like to merge a solution into the repository. The guide linked above should give you some idea of how to do that.

If anything is unclear, post an issue asking for help! If you have suggestions for new Bounties you can post there too!


Pretty Graphing Scripts

Difficulty: Easy --- Languages: Python

Background: In our research we constantly need to make graphs to include in our papers. Graphs made in tools like Excel tend to look cookie-cutter, and don't always format well for academic papers. Traditionally we have used the tool gnuplot to make graphs, but its scripting language is fairly archaic.

The Bounty: Python has some excellent graphing libraries which should be able to make us nicer graphs that are easier to customize. Matplotlib is one place to start, or better yet, look at a library that builds on it like Seaborn or Pandas.

Your program(s) should:

  • Read input data files in comma- or tab-separated format (see examples below). It is likely Python has an API to make this easy (i.e., don't use file IO to read a file line by line).
  • Generate graphs for two common types of data analysis:
  • Make it easy to control colors, font sizes, line widths/styles/etc by editing your script. The defaults should be attractive and designed for either displaying on the screen as a PNG or printing out as a PDF (which will typically require much larger fonts).
  • Your goal is not to make a universal program that takes a file name and outputs a graph, but rather a set of helper scripts that a user can easily edit to meet their own needs. This is because in practice we often need to make lots of small changes such as setting X/Y labels, enabling log scale, etc.
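
As a starting point, here is a minimal sketch of what one such helper script might look like, assuming pandas and Matplotlib are installed. The names `STYLES`, `load_table`, and `plot_lines` are hypothetical, as is the assumed file layout (a header row, one x column, and one column per line to draw). The style numbers are placeholders to tune, not recommended values.

```python
import matplotlib
matplotlib.use("Agg")  # render straight to files; no display needed
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical presets: edit these values instead of the plotting code.
STYLES = {
    "screen": {"font.size": 11, "lines.linewidth": 1.5, "figure.figsize": (6, 4)},
    "paper": {"font.size": 18, "lines.linewidth": 2.5, "figure.figsize": (6, 4)},
}


def load_table(source):
    """Read a comma- or tab-separated file (path or file-like object)."""
    # sep=None with the Python engine asks pandas to detect the delimiter.
    return pd.read_csv(source, sep=None, engine="python")


def plot_lines(df, x, out_path, style="screen", xlabel=None, ylabel=None, logy=False):
    """Draw every column except `x` as a line and save to out_path (.png or .pdf)."""
    with plt.rc_context(rc=STYLES[style]):
        fig, ax = plt.subplots()
        for column in df.columns:
            if column != x:
                ax.plot(df[x], df[column], marker="o", label=column)
        ax.set_xlabel(xlabel or x)
        if ylabel:
            ax.set_ylabel(ylabel)
        if logy:
            ax.set_yscale("log")
        # Dropping the top and right borders is a matter of taste.
        for side in ("top", "right"):
            ax.spines[side].set_visible(False)
        ax.legend(frameon=False)
        fig.tight_layout()
        fig.savefig(out_path)
        plt.close(fig)
```

A user would then edit a couple of lines at the bottom of the script, for example `plot_lines(load_table("results.csv"), "threads", "results.pdf", style="paper", ylabel="Throughput")`, where the file and column names are made up for illustration.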

When you are done, make a PR adding to this directory.

Script for CDFs and Histograms

Difficulty: Easy --- Languages: Python

Background: Cumulative Distribution Functions (CDFs) and Histograms are great ways to provide information about the likelihood of certain results. A CDF is similar to a histogram, but here is a nice article about why CDFs (actually I really mean Empirical CDFs) are even better. I commonly need to plot histograms (e.g., to show a grade distribution in a class) or CDFs (e.g., to compare the performance of different techniques in our research), yet sadly this is not directly supported by spreadsheet tools like Excel.

The Bounty: This is a variant of the Pretty Graphs project above, but it is more constrained, so it could be easier. Your task is to write an easy-to-use script that can generate a histogram or CDF from a data set. Your program should take a data file as input and output a nice graph. This should be simple with a library like Seaborn or Pandas.

  • Plot a histogram from a data file with an easy way to specify the min/max and bin size at the command line. Ideally your program should pick reasonable defaults for these.
  • Plot one or more CDF curves from a data file. This should follow the notes in the Pretty Graphs bounty for making the graphs attractive.
  • Sample files are provided here.
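
To make the two plot types concrete, here is a rough sketch using NumPy and Matplotlib directly (Seaborn or Pandas would work just as well). It assumes the data file holds one number per line, which may not match the sample files. The function names, the flag names (`--cdf`, `--min`, `--max`, `--bin-size`, `--out`), and the default of roughly sqrt(n) bins are all illustrative choices.

```python
import argparse
import math

import matplotlib
matplotlib.use("Agg")  # render straight to files; no display needed
import matplotlib.pyplot as plt
import numpy as np


def ecdf(values):
    """Return sorted values and, for each, the fraction of samples <= it."""
    xs = np.sort(np.asarray(values, dtype=float))
    ys = np.arange(1, len(xs) + 1) / len(xs)
    return xs, ys


def bin_edges(values, lo=None, hi=None, width=None):
    """Build histogram bin edges, filling in defaults for anything not given."""
    lo = float(np.min(values)) if lo is None else lo
    hi = float(np.max(values)) if hi is None else hi
    if width is None:
        # Assumed default: about sqrt(n) bins; fall back to 1.0 if hi == lo.
        width = (hi - lo) / max(1, round(math.sqrt(len(values)))) or 1.0
    count = max(1, math.ceil((hi - lo) / width))
    return lo + width * np.arange(count + 1)


def main(argv=None):
    parser = argparse.ArgumentParser(description="Plot a histogram or an empirical CDF.")
    parser.add_argument("datafile")
    parser.add_argument("--cdf", action="store_true", help="plot a CDF instead of a histogram")
    parser.add_argument("--min", type=float, dest="lo")
    parser.add_argument("--max", type=float, dest="hi")
    parser.add_argument("--bin-size", type=float, dest="width")
    parser.add_argument("--out", default="plot.png")
    args = parser.parse_args(argv)

    values = np.loadtxt(args.datafile, ndmin=1)  # one number per line (assumed)
    fig, ax = plt.subplots()
    if args.cdf:
        xs, ys = ecdf(values)
        ax.step(xs, ys, where="post")
        ax.set_ylabel("Fraction of samples <= x")
    else:
        edges = bin_edges(values, args.lo, args.hi, args.width)
        ax.hist(values, bins=edges, edgecolor="white")
        ax.set_ylabel("Count")
    fig.tight_layout()
    fig.savefig(args.out)
    plt.close(fig)
```

To run it from the command line, add `if __name__ == "__main__": main()` at the bottom and call it with something like `python plot_dist.py grades.txt --min 0 --max 100 --bin-size 5` (the script and file names are made up). Plotting several CDF curves on one graph would mean calling `ecdf` once per data set on the same axes.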

When you are done, make a PR adding to this directory.

Code Editor Stats

Difficulty: Medium --- Languages: JavaScript?

Background: I love seeing metrics about what I do every day. A few years ago I made an extension for the Atom editor that added an odometer counting how many letters you typed, arrow keys you pressed, etc. It's a pretty useless thing, but it was fun to build and use. Here's an example of how it looked:

[Screenshot: atometer]

Unfortunately I don't really use Atom anymore, and my prior implementation (available here) didn't work quite right anyway if you had multiple windows open.

The Bounty: Your task is to make a new version of this extension that works with Sublime and/or Visual Studio Code (since those are the two main editors I use). Or if you are an Atom user, you can revive my original project and extend it. Apparently 600+ people have downloaded it, so there is at least some demand.

If you are interested, make an Issue.

Slack Participation Bot

Difficulty: Hard --- Languages: JavaScript/Node.js/Databases

Background: I use Slack in many of my classes to run student discussions. In a normal classroom environment I like to track which students are actively participating, but this is harder to do on Slack (at least with the free version that doesn't include analytics). However, Slack allows you to write add-ons and bots which can interact with the platform in various ways.

The Bounty: Write a Slack Bot or extension which monitors a set of Slack channels and maintains statistics about how often each user writes a message. This information should then be available to each user (privately) or to the instructor (to see the full class).

  • Your Slack bot will actually be a web application, most likely written in JavaScript and running on the Node.js platform.
  • Slack has an API that allows you to register a callback whenever an event of interest happens. The callback is actually just a URL for your web application; every time it is called, your application will be passed some information about the event.
  • You should be able to use the Slack Message API to be notified for every message in a channel.
  • Slack has a Getting Started Guide for its Bolt API, but this tutorial might be even easier, and it shows how to start building your Bot on Glitch, a platform that makes web development even easier by managing the Node.js runtime for you automatically!
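
The callback flow described above can be sketched independently of any web framework. The sketch below is in Python to match the other examples on this page, although the bot itself would most likely be JavaScript. It assumes the JSON shapes used by Slack's Events API (a one-time `url_verification` request whose `challenge` must be echoed back, and `event_callback` requests wrapping a `message` event). The function name `handle_slack_callback` and the in-memory `Counter` are hypothetical, and skipping every message that has a `subtype` is a simplification that will also skip some real user messages.

```python
import json
from collections import Counter


def handle_slack_callback(body, counts):
    """Process one JSON request body and return the dict to send back.

    `counts` maps (channel, user) to a message count and is updated in place.
    """
    payload = json.loads(body)

    # Slack sends this once when the callback URL is registered.
    if payload.get("type") == "url_verification":
        return {"challenge": payload["challenge"]}

    if payload.get("type") == "event_callback":
        event = payload.get("event", {})
        # Simplification: count only plain messages. Edits, joins, and bot
        # posts arrive as message events with a "subtype" field.
        is_plain_message = event.get("type") == "message" and "subtype" not in event
        if is_plain_message and "user" in event:
            counts[(event["channel"], event["user"])] += 1

    return {}
```

A real deployment would call this from the web application's POST handler, keep the counts in a database rather than in memory, and verify Slack's request signature before trusting the body; none of that is shown here.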

A fully featured SlackStatsBot could do the following:

  • Observe one or more channels for new messages
  • Count how many messages each user writes in total and broken down by channel
  • Generate a report in CSV format similar to this example (click Raw to see the unformatted file) that shows how many messages each user sent. An admin could request this report with a command like /stats CHANNEL START_DATE END_DATE
  • Generate a full report in CSV format similar to this example which reports for every user how many messages they sent across all watched channels per week.
  • Anyone should be able to message the bot and get back output specifying that user's stats (i.e., one row from the full report). Students should not be able to see the full stats or the stats for other users.
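
The per-week report and the private single-user view could be sketched as below, again in Python for consistency with the other examples. The column layout (`user`, one column per ISO week, `total`) is a guess and may differ from the linked example. The names `week_of`, `tally`, and `report` are hypothetical, and messages are assumed to arrive as `(user, ts)` pairs where `ts` is a Slack-style epoch timestamp string.

```python
import csv
import io
from collections import Counter
from datetime import datetime, timezone


def week_of(ts):
    """Map an epoch timestamp string to an ISO year-week label (UTC)."""
    iso = datetime.fromtimestamp(float(ts), tz=timezone.utc).isocalendar()
    return f"{iso[0]}-W{iso[1]:02d}"


def tally(messages):
    """Count messages per (user, week) across all watched channels."""
    counts = Counter()
    for user, ts in messages:
        counts[(user, week_of(ts))] += 1
    return counts


def report(counts, only_user=None):
    """Return CSV text: one row per user, one column per week, plus a total.

    With only_user set, the output holds just that user's row, which is
    what a student messaging the bot would get back.
    """
    weeks = sorted({week for _, week in counts})
    users = sorted({user for user, _ in counts})
    if only_user is not None:
        users = [only_user]
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["user", *weeks, "total"])
    for user in users:
        row = [counts[(user, week)] for week in weeks]
        writer.writerow([user, *row, sum(row)])
    return out.getvalue()
```

The access rule then lives in one place: the bot calls `report(counts)` only for an admin and `report(counts, only_user=requester)` for everyone else. The per-channel report with a date range would need the channel and timestamp kept in the tally as well.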

This is a much more substantial project than the ones listed above, so I suggest you just pick a small part of this to work on. Once you have anything working--even just a simple bot that gets notifications of new messages--post an issue to get feedback and start a discussion with me about your plans.