We started this week asking “What would it take to make a list of all representatives in the UK?”.
To answer this question, or at least create more questions, we gave ourselves a week’s “spike”. The idea was to get to the end of the week more informed and able to make some decisions about whether we wanted to do anything else on the project to create a list of all elected representatives in the UK.
Over the week we explored some of the technical tasks involved. These tasks look a lot like what we have to do for our polling station or candidates work. They can be grouped into three categories:
We expect this is more or less the same for any work that looks a bit like “aggregate stuff that every council does into one place”. Finding patterns for doing this well in one area might end up being useful to other areas, like planning alerts, meeting agendas, notices, spending, contracts and so on.
“How” is quite well thought through, but we admit that we’ve not spent enough time on researching “why”.
On one hand “why” can be answered by stating that this information simply should be open for all. We know there is value in the data because there are some commercial providers of it already. The ability to run national services like WriteToThem has proved useful to the general public and charities.
One of the reasons it’s hard to answer “why” is that the downstream uses of data can be hard to imagine. That being said, we need to be careful not to fall into the trap of “if you build it, they will come”, as this isn’t true the majority of the time.
We’re not the first people to try this openly, so we should learn from past attempts.
We should commit to finding ways to demonstrate social impact.
Who should do this?
Not much is likely to happen unless we do something about it, but the real question is: should the state be doing this?
We think the answer is a strong yes and we have some ideas as to how below.
We also think that a good way of getting the state to create and publish this data is to lead the way.
The work we’ve done this week has given us some idea of the amount of work required to maintain a reliable list over the long term. We think it’s in the range of one or two full-time people. We can’t know exactly without testing how much long-term maintenance involves, but we’re fairly sure it’s more than zero people (fully automated) and fewer than five.
We can also make a reasonable guess that it would involve some work each week, as representatives change all the time. We see a handful of by-elections every week, but there will also be name changes, party defections and other changes going on.
Even with a well-funded team, the data will still have errors from time to time.
This means that it’s worth spending some time making tooling for maintaining the data that a lot of people can use. In other words, this is a project that should be more like Wikipedia than The Encyclopedia Britannica.
mySociety are exploring this type of work through their Democratic Commons project. It’s at a very early stage and would need to involve a coalition of collaborators to properly take root. We’d like to give it a go, but working out how to do this in reality will require more thought.
We want to explore questions like responsibility, funding by share of work, how to value different types of contribution, whether grant funding could work with a commons model, and so on.
There are few patterns in this area to follow. Even knowing what to call this sort of project is confusing. “Commons” is one way of thinking about it, defined as “cultural and natural resources accessible to all members of a society” (with a nice list of patterns here). The ODI have started talking about Data Trusts. We could also call it a sort of “public good” or “public work”.
It could be that one organisation (possibly even a new one) gets to be a custodian of sorts. It could be that we don’t need a hard dichotomy between “the state” and civic society – a state funded “commons” or “trust” project might be the best of both worlds, as long as it was sustained over the long term.
This is something we want to think about over the next few months even if we don’t take this representatives project any further.
Campaign or product?
One danger with making something is that we create a “new normal” by doing a lot of hard work for almost no money. mySociety’s excellent products are so embedded and expected that it’s easy to forget that they rely on a handful of people and a tiny amount of funding. Our own products like the polling station finder are less used or mature but are starting to fall into the same trap. We have no sustainable funding for services that are used by millions.
Grumbling aside, if we’re starting a new thing, we should make sure that we don’t commit to doing a lot of hard work without also presenting the work we’re doing to the state and making the case that they should be doing it.
That means that we need to think of this project as more of a campaign than a data service. The campaign is to get the state to provide this data, while also showing how it should be done.
We also need better ways to understand whether the campaign is working; if it isn’t, we need to know how to safely shut the project down.
How the state might do this
First a note on cost. We’re assuming it would take less than £100,000 a year to do manually, until the quality of the published data gets better. This is not a lot of money when compared to other projects. For example, the £5m that was given to Ordnance Survey in 2016 to “explore options” of opening up their data could fund this project for about 50 years. The £28m the government spends on posting election leaflets for candidates each general election would fund it for over 250 years.
There are a few implementation options. If the output is a single list of all representatives then it makes sense for some part of the UK government to take on the work. If we just wanted councillors then The Ministry of Housing, Communities & Local Government would be the logical place. They might not want to spend their time collecting Police and Crime Commissioners or members of the National Assembly for Wales though.
Different types of representatives could be collected by different departments. If that were to happen then at least the job of getting the full list would only involve aggregating around ten lists rather than hundreds.
In any case, a government department having the responsibility to collect and publish this information doesn’t make the fundamentally difficult parts of it go away.
They could hire a team of people to collect and clean the information, but that team would always be chasing what local authorities publish, and the copying itself could introduce errors.
They could maintain scrapers, but that just automates the same chase: it doesn’t prevent the errors humans make, and it introduces new classes of error of its own.
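To illustrate why scrapers introduce new classes of error, here is a minimal sketch using only the standard library. The page structure and CSS class names are entirely invented; every council site differs, which is exactly the problem.

```python
# Hypothetical scraper: pull councillor names out of a council web page.
# The HTML format below is invented for illustration.
from html.parser import HTMLParser


class CouncillorParser(HTMLParser):
    """Collects the text inside <td class="councillor-name"> cells."""

    def __init__(self):
        super().__init__()
        self.in_name_cell = False
        self.names = []

    def handle_starttag(self, tag, attrs):
        if tag == "td" and ("class", "councillor-name") in attrs:
            self.in_name_cell = True

    def handle_data(self, data):
        if self.in_name_cell and data.strip():
            self.names.append(data.strip())

    def handle_endtag(self, tag):
        if tag == "td":
            self.in_name_cell = False


# A made-up page in the format the scraper expects.
page = """
<table>
  <tr><td class="councillor-name">A. Example</td><td>Example Party</td></tr>
  <tr><td class="councillor-name">B. Sample</td><td>Sample Party</td></tr>
</table>
"""
parser = CouncillorParser()
parser.feed(page)
print(parser.names)  # ['A. Example', 'B. Sample']

# But if the council's CMS renames the class, the scraper silently
# returns an empty list rather than failing loudly - a new kind of
# error that a human copying names by hand would never make.
broken = CouncillorParser()
broken.feed(page.replace("councillor-name", "member-name"))
print(broken.names)  # []
```

The scraper works until the day the template changes, and then it produces a plausible-looking but wrong (empty or partial) result, which is harder to spot than an obvious failure.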
We think that the creator of data should be responsible for publishing it. In this case, that would be each organisation to which representatives are elected. This is because no one else is actually an authority on the information.
There are pre-digital ways of collecting fragmented data, such as: notices published in a gazette (notices of elections, striking off a company) or reporting requirements at the end of some timeframe. Both of these could work for representatives, but they introduce a layer of admin that isn’t needed for this sort of data in the digital age.
There are a few existing databases that are created through aggregation from local authorities. OpenDataCommunities contains some, and GeoPlace aggregate the Local Land and Property Gazetteer into AddressBase. Both rely on work to first report changes and then collect and clean them. This work tends to create a lag in publishing, or gaps in the data.
A better model would be to have each council maintain their list openly and make aggregating the lists easy to automate.
We don’t need to define the details of the solution here, but whatever it is will need to solve the three problems we’ve found this week.
The solution should involve using reliable identifiers for the things that already have them (parties, wards) and publishing in discoverable locations, making automation easier.
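The aggregation step above can be sketched in a few lines. Everything here is an assumption for illustration: the publishing URL pattern, the field names, and the specific identifier values. The point is that when each council publishes at a predictable location and uses identifiers that already exist (ONS GSS ward codes, Electoral Commission party register IDs), merging becomes mechanical rather than a fuzzy-matching problem.

```python
# Hypothetical sketch of aggregating per-council lists. Imagine each
# council publishes a file at a discoverable location such as
#   https://<council>.gov.uk/open/councillors.json
# Here two such downloads are stood in with literal data; all field
# names and identifier values are invented for illustration.
import json

council_a = json.loads("""[
  {"name": "A. Example", "ward_gss": "E05000001", "party_ec_id": "PP01"}
]""")
council_b = json.loads("""[
  {"name": "B. Sample", "ward_gss": "E05000002", "party_ec_id": "PP02"}
]""")


def aggregate(*council_lists):
    """Merge council-published lists into one national list.

    Because wards carry GSS codes and parties carry Electoral
    Commission register IDs, records from different councils can be
    cross-referenced without any name matching.
    """
    combined = []
    for councillors in council_lists:
        for person in councillors:
            combined.append({
                "name": person["name"],
                "ward_gss": person["ward_gss"],        # ONS GSS ward code
                "party_ec_id": person["party_ec_id"],  # EC register ID
            })
    return combined


national_list = aggregate(council_a, council_b)
print(len(national_list))  # 2
```

In this model the hard work stays where the authority on the data is (each council keeps its own list current), and the national list is just a scheduled merge.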
We’ve talked about service registers in the past as part of a solution. Other options exist, like the reporting requirements we saw for the local links manager.
Whatever happens, the solution needs to integrate fully into existing workflows and help the people who use them do their job. This might involve working with CMS suppliers. ModGov is already doing well by publishing the data via an API.
A good next step might be to partner with some councils who are keen on trying things out.
At the end of our week exploring whether we should carry on with this project, we’ve come up with a really well-informed “maybe”.
We know it’s not free and that we can’t (and shouldn’t) try to do this on our own. James McKinney writes about some of the traps this sort of project can fall into, and the real answer is that there are no shortcuts to getting this done.
We also want to work out how to run a civic tech project whose aim is to stop existing in a short timeframe. This means being great at communicating complex technical aspects like stable identifiers and predictable URLs to those in power who can do something about improving the situation.
The good news is that we’re not asking for huge sums. Even in a political environment not known for throwing cash about, we think this is a tiny fraction of a governmental programme budget. The better news is that it’s not that hard – we think a tiny team could deliver it.
The next thing we plan to do is write up the campaigning and business case, trying to fill in the gaps in the social impact. We want some sort of statement of intent and a plan. Ideally we’d like some more partners in the form of campaigning groups, civic society, local authorities or central government.
Please get in touch if you are interested in talking about this project.
Photo credit: mdpettitt