Whereas many of our projects rely on crowdsourcing to assemble a dataset, Where Do I Vote relies on councils to publish or share data with us. A small handful of councils proactively publish their data. For example, Lichfield publish their data on data.gov.uk and Calderdale publish their data via their own portal. However, most councils don’t publish this data, and to provide a useful level of coverage we need to obtain data from as many councils as possible.
Our first approach to this problem was to submit a Freedom of Information request to every council requesting that they release their polling station locations and polling district boundaries. We blogged about this back in 2015. Since then we have changed the way we tackle this problem and now seems like a good time to write a bit about how and why we have changed our approach.
Although we were able to obtain a large volume of useful data using FOI, we quickly realised that there were several problems with this approach:
- FOI is optimised for one-time publication, or obtaining data at a single point in time. In order to keep track of a dataset which changes over time, you must repeatedly go through the Freedom of Information process.
- Submitting FOI requests put us in contact with information officers rather than electoral or GIS teams. This made it difficult to ask questions or seek clarification, because we were not in direct contact with the relevant data custodians and domain experts.
- Although applicable organisations are obligated to respond to requests within 20 working days, FOI can be a slow process.
- While a large number of councils did release information, some declined to do so on the grounds that Electoral Registration Officers are not a public authority. The Electoral Commission provide guidance advising “where possible, Electoral Registration Officers should disclose the requested information, provided it is already in the public domain or does not include personal data”, but we find ourselves in a somewhat murky corner of the FOI Act.
- Requesting data via FOI can be considered a confrontational approach.
As a result of our experiences in 2015, we realised that we would need to work more closely with Electoral Services and GIS teams and pursue a more collaborative approach. Where we were able to find councils interested in working with us, this resulted in faster turnaround times and made it much easier to follow up questions or problems with data. Most importantly, we started to build relationships with people directly interested in what we are doing. Once we started to build up these connections, we found that some councils would notify us of last-minute changes or contact us proactively to provide data for future elections. This would never have happened if we had relied on Freedom of Information requests.
This represented an improvement over our previous approach, but we still needed to refine our tactics. When we started requesting data from councils, we assumed that a dataset of GIS polygons representing the polling district boundaries and a corresponding dataset of points representing the polling stations was the most sensible format to request. This was partly because the handful of councils proactively publishing open data were representing their data in this format, so we assumed this was the natural or most common format for the information to be stored and used in. Data in this format also worked well in the context of our software architecture. However, as we expanded our network of contacts to encompass a wider range of local authorities, we started to see a number of recurring pain points:
- Producing data in this format requires collaboration between electoral and GIS teams
- GIS teams in councils are very busy and respond to requests from a range of different internal teams
- The level and availability of GIS expertise varies considerably across councils
- Electoral teams were sometimes keen to work with us but unable to do so due to capacity issues or lack of access to applicable expertise within their organisation
- Data in this form lends itself to a number of data consistency problems, requires careful checking and often requires follow-up queries
- GIS teams were sometimes not aware of updates, particularly last-minute changes to data, and providing updated data could be difficult for some councils in these circumstances
- Open data like this is seen as a ‘nice to have’ so might be one of the first things to be forgotten about at busy times.
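To make the polygon-and-point format concrete, here is a minimal sketch of how such data can be used to find a polling station, and of the kind of consistency check it calls for. Everything in it is an assumption made for illustration: the district codes, station names, coordinates and function names are invented, and the hand-written ray-casting test stands in for whatever GIS library a real system would use.

```python
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]

# Hypothetical sample data: polling districts as polygons, stations as points.
districts: Dict[str, dict] = {
    "AA": {"station_id": "S1", "boundary": [(0, 0), (4, 0), (4, 4), (0, 4)]},
    "AB": {"station_id": "S2", "boundary": [(4, 0), (8, 0), (8, 4), (4, 4)]},
    "AC": {"station_id": "S9", "boundary": [(0, 4), (4, 4), (4, 8), (0, 8)]},
}
stations: Dict[str, dict] = {
    "S1": {"name": "Village Hall", "location": (1.0, 1.0)},
    "S2": {"name": "Primary School", "location": (6.0, 2.0)},
    "S3": {"name": "Library", "location": (7.0, 7.0)},
}


def point_in_polygon(point: Point, polygon: List[Point]) -> bool:
    """Ray casting: count the edges crossed by a ray running right from the point."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point matter.
        if (y1 > y) != (y2 > y):
            # x coordinate at which the edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def find_station(home: Point, districts: Dict[str, dict],
                 stations: Dict[str, dict]) -> Optional[dict]:
    """Return the station for the first district containing the point, if any."""
    for district in districts.values():
        if point_in_polygon(home, district["boundary"]):
            return stations.get(district["station_id"])
    return None


def consistency_problems(districts: Dict[str, dict],
                         stations: Dict[str, dict]) -> List[str]:
    """List districts pointing at unknown stations, then stations nobody uses."""
    problems = []
    used = set()
    for code, district in sorted(districts.items()):
        station_id = district["station_id"]
        used.add(station_id)
        if station_id not in stations:
            problems.append(f"district {code} references unknown station {station_id}")
    for station_id in sorted(set(stations) - used):
        problems.append(f"station {station_id} is not used by any district")
    return problems
```

With this sample data, `find_station((2, 2), districts, stations)` returns the Village Hall record, while `consistency_problems(districts, stations)` flags district AC (which points at a station that does not exist) and station S3 (which no district uses). Mismatches like these between the two datasets are the sort of thing that needs a follow-up query to the council.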
We realised that in order to overcome these issues we needed to work more closely with electoral services teams and make it as easy as possible for electoral officers to provide us with data, regardless of their access to GIS expertise within their organisation.
We knew that councils managed this data in some way because it’s needed to print poll cards. After some searching we found that each council uses “Electoral Management Software” (EMS) to manage everything from poll cards to registration.
If we could get the same export from this software that is sent to the poll card printers, we would be much more likely to obtain the correct data directly from electoral officers.
We were invited to attend and present at the annual conference of the Association of Electoral Administrators, which all the software suppliers attend. We used this as an opportunity both to pitch to authorities themselves and to introduce ourselves and chat to each software supplier.
By reaching out to Electoral Management Software suppliers, we were able to work with three of the four suppliers (Halarose, Xpress and Democracy Counts, with Idox being the odd one out) to make it easier for their customers to export data directly from their system. This export is essentially an anonymised electoral roll with polling station data attached to it, which we can use to provide coverage for their area. Our thanks to Halarose, Xpress and Democracy Counts, who all added features or documentation to their software to help their customers provide this information more easily.
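As a rough illustration of why an address-level export is simpler to consume than GIS data, the sketch below looks up polling stations by postcode from two small CSV files. The column names, station codes, addresses and postcodes are all hypothetical; real exports differ between suppliers and none of their actual formats are reproduced here.

```python
import csv
import io
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical export: one row per address (no elector names), each carrying
# the code of the polling station assigned to it.
ADDRESSES_CSV = """address,postcode,station_code
1 High Street,AB1 2CD,S1
2 High Street,AB1 2CD,S1
1 Mill Lane,AB1 3EF,S1
2 Mill Lane,AB1 3EF,S2
"""

# Hypothetical companion file describing the stations themselves.
STATIONS_CSV = """station_code,station_name
S1,Village Hall
S2,Primary School
"""


def normalise_postcode(postcode: str) -> str:
    """Strip whitespace and upper-case so 'ab1 2cd' matches 'AB1 2CD'."""
    return "".join(postcode.split()).upper()


def load_lookup(addresses_csv: str,
                stations_csv: str) -> Dict[str, List[Tuple[str, str]]]:
    """Build a postcode -> [(address, station name)] mapping from the two files."""
    station_names = {
        row["station_code"]: row["station_name"]
        for row in csv.DictReader(io.StringIO(stations_csv))
    }
    by_postcode = defaultdict(list)
    for row in csv.DictReader(io.StringIO(addresses_csv)):
        key = normalise_postcode(row["postcode"])
        by_postcode[key].append((row["address"], station_names[row["station_code"]]))
    return dict(by_postcode)


def stations_for_postcode(lookup: Dict[str, List[Tuple[str, str]]],
                          postcode: str) -> List[str]:
    """Return the distinct station names for a postcode, in first-seen order."""
    names: List[str] = []
    for _address, name in lookup.get(normalise_postcode(postcode), []):
        if name not in names:
            names.append(name)
    return names


lookup = load_lookup(ADDRESSES_CSV, STATIONS_CSV)
```

In this sample, `stations_for_postcode(lookup, "ab12cd")` returns a single station, whereas postcode AB1 3EF is split between two stations, so a user there would need to pick their address before being given an answer. No geometry is involved at any point, which is why data in this shape does not depend on GIS expertise to produce or to check.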
Consuming the data which is easiest and most natural for electoral services teams to produce means we are now talking their language. This has been our most successful strategy yet. It is now our preferred method of receiving data from councils, and this is reflected in our guidance for councils, although we are still happy to work with GIS data from councils who prefer to publish in this way. By working more closely with the teams who are directly invested in the outcomes of our project:
- We are in direct contact with the right people to follow up queries and clarifications
- Data custodians who are responsible for making changes to the data can easily supply us with updates
As a bonus we are now dealing with a much more constrained range of data formats and the process of obtaining data from councils is many times faster than using Freedom of Information requests.
So what have we learned from this process?
- The Freedom of Information Act is an immensely powerful instrument but it isn’t always the right approach.
- As a data consumer, find ways to ‘speak the language’ of relevant data custodians and align your objectives.
- Continuously re-evaluate your approach to solving problems. As we learned more about the problem domain, we changed our approach several times before finding the best strategy.
- If your data doesn’t come from the custodian directly then you need to double check that it is correct. A lot of data is published once by well-meaning people, but it can go stale quickly.