Welcome to our second post of 2017. Busy times. There are just months to go until a range of local elections across the UK and, what’s more, this week it was announced that elections will be held to the Northern Ireland Assembly at the beginning of March. The crowdsourcer is up and running for Northern Ireland and it’s coming soon for those local elections.
Much democracy. Many elections.
As we get everything together in order to start crowdsourcing candidates for May, it’s a good time to talk about the infrastructural stuff we’re doing to make it all more sustainable — to improve the prospects of truly capturing and opening all data relating to UK elections.
We’ve blogged before about our database of every election in the UK, but in today’s blog there’s a behind-the-scenes read from Sym, who explains in more detail how we are building it, and some of the challenges with doing so.
Also in today’s blog: a quick shout about election results — who needs them? And who wants to help? Read on…
The Making of Every Election
Why are we creating a database of every election? Because if we can automate as much of this stuff as possible then we can start automating our ‘upstream’ services like the database of candidates. At the moment we have to manually set up the crowdsourcing site for each election. With this service we can pull in the divisions of an organisation and other data automatically.
Currently, there’s no official list with a simple ID system for elections. Democracy Club isn’t well placed to create such an authoritative ID system from the outside, so we’ve invented a system that allows anyone to create an election ID, using as much semantic information about the election as possible.
It’s quite complicated, involving brand new datasets and IDs. Here’s how it works.
The first thing we did was to create an ID system for elections themselves. Normally an ID should only be issued by the authority that makes the thing (the election in this case). The problem is that there is no single body that creates elections in the UK. The Electoral Commission provides guidance and monitors spending; district or borough councils administer elections; there are Statutory Instruments that can cause them to happen; each organisation can decide to hold them; and so on.
What do we mean by ‘election’? Take the Police and Crime Commissioner elections as an example. Is the ‘election’ the collection of all PCC elections on a single day, meaning you have a single ID like pcc.2016-05-05? Or is there an ID per Police Authority, meaning you get pcc.avon-and-somerset.2016-05-05? What about in the case of a by-election for a single ward in a local authority?
Because an ‘election’ can refer to more than one thing we decided to issue an ID per unique ballot paper, as well as group IDs. For example, for a single by-election in Abbey ward in Cambridge we issue three IDs:
This makes it easier to find groups of elections. For example, every local election happening in Cambridge that day, or every local election happening across the UK that day.
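To make the relationship between a ballot ID and its group IDs concrete, here is a minimal Python sketch. It assumes the parts are joined with dots in the order type, organisation, division, date, extending the pcc.avon-and-somerset.2016-05-05 pattern above. The function name, the slugs and the by-election date are hypothetical and chosen only for illustration.

```python
from datetime import date
from typing import List, Optional


def election_ids(election_type: str, organisation: str,
                 division: Optional[str], poll_date: date) -> List[str]:
    """Return the group IDs and the ballot ID, broadest first.

    Assumed layout: type[.organisation[.division]].date
    """
    day = poll_date.isoformat()
    parts = [election_type, organisation]
    if division is not None:
        parts.append(division)
    # Each prefix of the parts, followed by the date, is one ID.
    return [".".join(parts[:n] + [day]) for n in range(1, len(parts) + 1)]


# Hypothetical by-election date, for illustration only.
ids = election_ids("local", "cambridge", "abbey", date(2017, 1, 1))
for election_id in ids:
    print(election_id)
```

Because every group ID shares its leading parts with the ballot ID, finding "every local election in Cambridge that day" becomes a simple match on the group ID.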
The first part of the ID is the type of election. For example, local for a district or borough council election, or parl for an election to the House of Commons. At this moment, we’re only covering election types mentioned in that link. But watch this space.
The next part of the ID is the organisation. We define an “organisation” as anything that can have people elected to it. For this bit of the ID, we need a list of organisations for each type of election.
We use various sources of organisations. For local authorities in England we use the beta register, or the ‘discovery’ registers for Scotland, Wales and Northern Ireland. We use data.police.uk for police forces. We hard-code the mayoral elections, UK parliaments and the Greater London Authority.
Here’s where it starts to get a bit complex. Most of the organisations divide themselves into smaller units and elect one or more persons per unit. This isn’t true of how PCCs or mayors are elected as there is only one person covering each organisation, but it is true of local and parliamentary elections.
The complicated bit is that these divisions change over time. In order to ensure that people get more or less equal representation, and to prevent rotten boroughs, we have independent boundary commissions for each nation in the UK, for both parliamentary constituencies and local authorities. Each commission will periodically review the electorates in each area and review how the organisation is geographically divvied up to ensure that each subdivision is of roughly equal population size.
They have the power to create Statutory Instruments that abolish one set of divisions and create a new set. When this happens a new election for the whole organisation is triggered, normally scheduled for the next time a regular election is due.
This causes us a headache for a number of reasons. First, we need to know the “start” and “end” date of each set of divisions and to make sure that, when we create election IDs, we’re making them for the correct divisions (based on the date).
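A rough sketch of that date check in Python follows. The DivisionSet class, the divisions_on function and all of the data are made up for illustration. The sketch only assumes that each set of divisions records the date it came into force and, once abolished, the last date it applied.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class DivisionSet:
    organisation: str
    start: date
    end: Optional[date]  # None means the set is still in force
    divisions: List[str]


def divisions_on(sets: List[DivisionSet], organisation: str, day: date) -> List[str]:
    """Return the divisions of an organisation in force on the given day."""
    for division_set in sets:
        if division_set.organisation != organisation:
            continue
        still_valid = division_set.end is None or day <= division_set.end
        if division_set.start <= day and still_valid:
            return division_set.divisions
    raise LookupError(f"no division set for {organisation} on {day}")


# Made-up data: an old set replaced by a new one on a hypothetical date.
sets = [
    DivisionSet("example-council", date(2004, 6, 10), date(2017, 5, 3), ["old-ward"]),
    DivisionSet("example-council", date(2017, 5, 4), None,
                ["new-ward-east", "new-ward-west"]),
]
current = divisions_on(sets, "example-council", date(2017, 5, 4))
```

Raising an error when no set matches is a deliberate choice in this sketch: it is safer to fail loudly than to mint IDs against divisions that no longer exist.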
Second, the information we need on these divisions isn’t always published nicely. We need the following information about a division in order to make it useful:
- The name
- The number of people who can be elected in that area
- An identifier
- The geographic boundary (in a data format like Shapefile or GeoJSON)
The first two parts are published in the Statutory Instruments (see, e.g., Schedule 1 of the Cambridgeshire (Electoral Changes) Order 2016), and we’ve pulled all the data for the recently completed changes out of legislation into a public spreadsheet.
For the identifier, the boundary commissions don’t create official ones, so we need to rely on something else. The next best thing to an official identifier is the code issued by the Government Statistical Service. The problem is that they don’t publicly issue codes for the new areas at the point they are created in law. It’s not clear when they issue them, but in 2016 they hadn’t issued public IDs for the new areas with elections in May until the start of April.
We got sent them early, after meeting someone from the Office for National Statistics in a pub.
So we can’t rely on GSS to issue IDs as quickly as we’d like. Because of this we need to invent an ID that will do until we get a better one. We take a crude approach here and just join the organisation identifier to the slug of the name of the division.
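Here is a minimal sketch of that crude approach in Python. The slugify helper is a simplified stand-in written for this example, and the colon separator and the sample names are assumptions, not necessarily what our real system uses.

```python
import re
import unicodedata


def slugify(name: str) -> str:
    """Lower-case a name and reduce it to ASCII letters, digits and hyphens."""
    ascii_name = (
        unicodedata.normalize("NFKD", name)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    return re.sub(r"[^a-z0-9]+", "-", ascii_name.lower()).strip("-")


def temporary_division_id(organisation_id: str, division_name: str) -> str:
    # Assumption: a colon separates the two parts; the real separator may differ.
    return f"{organisation_id}:{slugify(division_name)}"


# Hypothetical organisation identifier and division name.
example = temporary_division_id("example-council", "Abbey & Castle Hill")
```

An ID built this way is only as stable as the division's name, which is why it is a stopgap until a GSS code turns up.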
Geographies are interesting too. The boundary commissions create the geographies, but claim that they are unable to publish them as open data because of Ordnance Survey (the great vampire squid wrapped around the face of UK public-interest technology). They argue that because they draw on OS ‘base maps’ in order to create the new geographies they are unable to license them freely. This may or may not be true: OS don’t make it easy to understand what can and can’t be done with their data. We think that the data can be opened, but it’s so complex that we can understand that they would rather be closed, just in case they get sued by another part of the government. Legal arguments aside: we don’t have the data.
The data is eventually published by OS themselves. They only include “active” geographies in their ‘boundary line’ product. This means that, because of their six-monthly update cycle, the boundaries could first be published months after an election. We wrote about this last year and then followed up with the statement from OS.
If we were to design this system from scratch, we would give the boundary commissions the duty to create standard IDs themselves and the freedom to publish the shapes of the boundaries they make as open data. Relying on three government organisations to work together on this will never be painless, however much individuals in each organisation try.
None of this should stop us creating election IDs though, and we can punt the problem of lack of data off for now. As long as the system doesn’t rely on either GSS or OS to work, we should be fine.
The last part of the ID is the date. An election has a start date and sometimes a different end date — although all current UK elections only have a single polling day. We put the date in the ID as an easy way to distinguish it from other elections that happen, but this causes a problem: sometimes we know that an election will happen, but we don’t know the date. This ‘unknown’ only lasts a short time, such as in the weeks just after an MP has stood down. We still need to create an ID for this election, so we make a temporary ID, which is replaced by a proper ID when we know the date. This is far from perfect, but it allows us to log elections earlier.
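One way to picture the temporary-then-proper ID swap is the Python sketch below. The ElectionLog class, the tmp- placeholder format, the ID prefix and the date are all hypothetical; the post does not specify what a temporary ID looks like, so treat this as an illustration of the idea only.

```python
from datetime import date
from typing import Dict, Optional


class ElectionLog:
    """Creates election IDs and remembers which temporary IDs were replaced."""

    def __init__(self) -> None:
        self.replaced: Dict[str, str] = {}  # temporary ID -> proper ID
        self._next_temp = 1

    def create(self, prefix: str, poll_date: Optional[date] = None) -> str:
        if poll_date is not None:
            return f"{prefix}.{poll_date.isoformat()}"
        # Hypothetical placeholder format for an election with no date yet.
        temp_id = f"{prefix}.tmp-{self._next_temp}"
        self._next_temp += 1
        return temp_id

    def confirm_date(self, temp_id: str, prefix: str, poll_date: date) -> str:
        """Issue the proper ID and record that it replaces the temporary one."""
        proper_id = self.create(prefix, poll_date)
        self.replaced[temp_id] = proper_id
        return proper_id


log = ElectionLog()
temp = log.create("parl.example-constituency")
final = log.confirm_date(temp, "parl.example-constituency", date(2017, 1, 1))
```

Keeping the mapping from temporary to proper IDs means anything that stored the early ID can still be pointed at the right election later.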
Pulling it all together
Once we have all this data together in a system, we need a way to create IDs. We made a ‘wizard’ style form that walks you through the process of creating IDs. Give it a go if you like, and let us know how you get on.
Let us know if you’d like to use the data – we’re working on an API right now. If you know of elections that we might not have added, please go and make them on the site. And we’re keen to get feedback on all of this!
Who loves election results?
We do, of course. And so does the Local Government Information Unit. We partnered last year to prototype a results recorder under the splendid moniker ‘Out for the Count’.
We’ve ironed out a few gremlins and want to test the system again in 2017. But we think there are others who must love open election results data too. We just need to find them.
Psephologists? Academics? Would you benefit from open results data? AP/BBC/Sky/ITN, journos generally — would you benefit from this?
The Open Data Institute helped us out with some funding last year, and we’re on the lookout for financial support for this project. But more than that, we’re looking to hear from people who would find it useful.
Pre-election season is definitely underway: as well as pushing forward on all of the above, we’re also doubling our efforts (with the help of the Electoral Commission) to increase coverage of the polling station finder in time for May. We’re getting some governance stuff in order, trying to design a subscription offer, thinking about a bit of a restyle, and planning for some conferences around the UK in February. Phew.