We use a lot of open source software: the Linux operating systems on (some of) our computers, the text editors we use to write software (e.g. VS Code, vim), the programming languages we use (e.g. Go, .NET, PHP, Python, Java), the libraries and utilities we use, and the products themselves: web servers, databases and other tools.
I’m not sure it’s easy to create software using only proprietary tools any more. Even if you used Visual Studio on Windows, a lot of .NET code is open source, so you’d probably end up using an open source library.
It’s easy to take open source for granted. As software engineers, we’re now used to being able to look at the code of a product or tool we’re using to understand what it’s doing, and even to propose changes, but it wasn’t always that way.
Successful open source projects require effort from the maintainers who provide their time (often for free) and from the wider community in the form of testing, issues, pull requests and promotion.
I decided to have a look at what the people at Infinity Works had been contributing in 2018. To make it easier for me, I limited it to Github and I tried to remove our “day jobs” (the work we get paid to do for our clients). Even though we may do open source work for clients, I was interested in other contributions.
There are 139 people in our Github organisation. Here’s what I found.
We created 71 issues. An issue can be a bug report, a feature request or even a request for help. The number surprised me because it’s so low, but it’s probably accurate.
People use issues in lots of ways. Some raise issues within repositories not owned by them, but others are using them as placeholders for features, e.g. raising an issue in one of their own repos to remind them to do something.
The top issue creator was wojtek-oledzki (https://github.com/wojtek-oledzki). It looks like he’s working on a project called https://github.com/sansible, which is about using Ansible to set up a lot of modern infrastructure like Kafka, and is using issues to track work.
YOU54F (https://github.com/YOU54F) is creating a lot of issues and fixes around Pact (consumer-driven contract testing).
dvejmz (https://github.com/dvejmz) is finding a lot of issues and missing files, and AdeOpe (https://github.com/AdeOpe) has been sharing his scripts for getting OAuth up and running with Cognito.
I guess overall it’s good news. The top 10 people raised 50 of the 71 issues, which means that most people probably didn’t have any issues with open source software that they needed help with via Github.
Pull requests are the next level of engagement. A pull request is a contribution of changes to software. They’re tricky.
Sometimes, your pull request gets merged, sometimes it doesn’t. It depends on lots of factors:
- Whether your change is well written / matches the project style.
- Whether the maintainers want your change / it matches the project goals.
- Whether the maintainers are busy or don’t care about the project any more.
I’ve been on both sides here. I spent a long time on a pull request to a text editor adding autocomplete (a feature I really wanted – https://github.com/zyedidia/micro/pull/977), but the maintainer never found the time to merge it. Also, I feel a bit guilty about a pull request in a project I started (a JSON Schema to Go generator).
I wrote the tool because I needed it for a project; I’m not a fan of JSON Schema or anything. The pull request is a big change and I’m happy if the users of the tool want to merge it, but I’m not really an expert in JSON Schema, and don’t particularly want to spend my leisure time thinking about it, so it’s been sat there for a while until I have a rainy day to think about it.
So… how many pull requests have we raised that are sat there unmerged?
I’m sad to say that it’s 54 pull requests just withering away on branches. The people who wasted the most time on unloved code might be JamesSheard (https://github.com/JamesSheard) and MikeSilverstone (https://github.com/MikeSilverstone), with six pull requests each. The good news is that it seems to be part of some training they were doing, so hopefully they learned something.
Apart from that, we had 161 merged. They’re pretty diverse but, from what I saw, mostly small changes to various libraries and websites.
Quantity of repositories
So who has open sourced lots of personal projects?
cindyjialiu (https://github.com/cindyjialiu) is in the clear lead here, having done a lot of proof-of-concept and learning work across a wide range of programming languages in the last year.
I’m next. In the last year I’ve made a number of small utilities and libraries for Go:
- Go: Dates which support custom JSON formatting – https://github.com/a-h/date
- Go: ordered maps / sets – https://github.com/a-h/setof and https://github.com/a-h/mapof
- Some Serverless security tools for analysing Github repos and keeping up-to-date with the NIST National Vulnerability Database – https://github.com/a-h/nvdnotifier and https://github.com/a-h/watchman
- A command-line file search tool (a bit like grep) – https://github.com/a-h/search
- Some machine learning algorithms – https://github.com/a-h/ml
- Go: middleware to hide content behind Google domain authentication (to have a website private to a Google org) – https://github.com/a-h/gauthmiddleware
- Go: a library to extract path parameters from URLs to prevent you from needing to reference your HTTP router in your HTTP handlers (having to reference your router couples your handlers to your router too tightly for me) – https://github.com/a-h/pathvars
- Go: a cache designed for Serverless applications that uses Kinesis to update multiple nodes as data changes – https://github.com/a-h/scache/
- Go: tools for writing parsers – https://github.com/a-h/lexical
- An app for tracking your phone’s mobile connectivity on train journeys (the GPS resolution wasn’t really good enough) – https://github.com/a-h/Connect
benbrunton (https://github.com/benbrunton) has been doing a bit of Rust and some socket.io stuff, and steinfletcher (https://github.com/steinfletcher) looks to be doing something interesting with sequence diagrams and Go.
Altogether, we’ve updated 280 of our personal repos in the last year.
OK, the popularity contest. I’ve only counted stars from repos updated in the last year, and only for people’s personal repos.
Next most popular is JamesBarwell (https://github.com/JamesBarwell), whose repos updated this year have 467 stars. His Node.js library for working with Raspberry Pi (https://www.raspberrypi.org/) GPIO ports (https://github.com/JamesBarwell/rpi-gpio.js) is really popular.
I’m next most popular for the JSON Schema to Go struct thing (which is due a rewrite to use a neat Go JSON schema library I found).
I can’t talk about community without mentioning the real-world meetups that Infinity Works people help out with. We run or sponsor community events, not because we have to, but because we enjoy it.
We run a couple in Manchester that I really like.
Manchester: DevOps Battle Royale (https://www.meetup.com/The-DevOps-Battle-Royale/) – short talks followed by group discussions, a technological victor must emerge.
Manchester: Infinity Works 101 sessions (https://www.meetup.com/Infinity-Works-101-Sessions/) – free training in cutting-edge tech.
I also like CodeUp. I tend to go to the one in Leeds, which is sponsored by Infinity Works (https://www.meetup.com/CodeUp-Leeds/) but others in the team go to the Manchester one (sponsored by Code Computer Love) (https://www.meetup.com/CodeUpManchester/).
I used to love a bit of Leeds DevOps (sponsored by Infinity Works), even though I haven’t been to one since I’ve been working in Manchester. I’ve been to one GoLang Manchester (https://www.meetup.com/golang-mcr/), but it always seems to clash with something else for me.
I think it’s a great thing that people want to spend their time to learn from others and share what they know.
The community at Infinity Works also stumped up £300 last year to donate to an open source project out of their own pockets (which was then doubled to £600 by Infinity Works).
How did I find this out?
I wrote some code in Go at https://github.com/a-h/openanalysis which uses the Github GraphQL API (https://developer.github.com/v4/) to collate the data. If you’re interested in using the Github API, it’s probably a good script to look at anyway.
You’ll need to pass it a Github API token if you want to run it yourself (https://help.github.com/articles/creating-a-personal-access-token-for-the-command-line/).
The first part of it writes the data for each Github user to the users.json file, while the sum script creates the summary data.
So feel free to run it against yourself. I suppose you should submit a pull request if you think the tool could be improved, or drop me a star so I feature more highly next year.