Back to the Future of Data Reuse

Three weeks ago, we published a blog post with some new draft terms of use for Democracy Club’s data. Lots of people kindly provided feedback. It’s all quite tricky. This blog post tries to summarise some of the issues and our thinking, then states what we’ll do next.

In sum, data licences won’t change and access to the CSVs won’t change. Access to the API will change, in a way that we hope gives users more clarity than they have now.

🍋 Difficult difficult, lemon difficult 🍋

There’s no perfect way forward here — merely a series of tradeoffs; known unknowns and quite a few unknown unknowns. There are things that could go badly or good things that could be unknowingly prevented. We also need to move quite quickly as we sense an election a-coming.

It’s certainly tempting to throw everything completely open and say ‘go for it’ — subject only to a rate limit on the API to prevent it falling over. Generally, Democracy Club tries to be pretty optimistic about openness.

However, we do want to be able to understand who is using the API and how much, and to get in touch with users to chat about improvements and impact measurement. And, in a worst case, we want to be able to turn access off to prevent bad things.

To some extent, we can shape or guide reuse through the design and structure of the API itself and through good communication to those using it, rather than through legal terms. For example, an enthusiastic newcomer can’t be expected to understand all of the weird ‘edge cases’ that we know about through years of working on this stuff, so we can highlight them in the API documentation. Or, similarly, where we don’t really want people to cache a result for a polling station (because your polling station changes), we can do something technical to communicate to humans and machines what we think the cache length should be, rather than to try to specify an approach in the terms.
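
One standard way to tell machines how long a result stays fresh is the HTTP `Cache-Control` header. Here is a minimal sketch in Python, assuming a hypothetical helper called `cache_headers`. The cache lengths are made-up illustrations, not a description of how the API actually behaves.

```python
from typing import Dict

# Hypothetical cache lengths, chosen only to illustrate the idea.
POLLING_STATION_MAX_AGE = 60 * 60       # one hour: allocations can change
CANDIDATE_MAX_AGE = 24 * 60 * 60        # one day: changes less often


def cache_headers(max_age_seconds: int) -> Dict[str, str]:
    """Build an HTTP header advising clients how long to reuse a response."""
    if max_age_seconds < 0:
        raise ValueError("max_age_seconds must not be negative")
    if max_age_seconds == 0:
        # Ask clients and intermediaries not to store the response at all.
        return {"Cache-Control": "no-store"}
    # max-age is the number of seconds the response may be reused.
    return {"Cache-Control": "public, max-age={}".format(max_age_seconds)}


headers = cache_headers(POLLING_STATION_MAX_AGE)
print(headers["Cache-Control"])
```

Well-behaved clients and proxies read the header and discard the result once it has expired. The same number can be repeated in the API documentation for the humans.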

🏓 Q&As 🏓

Thanks again for all the comments. A few are highlighted below.

What use could be so bad that we’d need to switch access off?

Typically, what we’re afraid of isn’t malicious use, but misleading use. For example, the polling location data could be used to send people to the nearest polling station instead of their allocated one. Or with candidates data, someone might decide that voters only need to see the top two in every constituency.

It’s possible to imagine malicious use too: it wouldn’t be too difficult for someone to use the data to create a better-funded voter information app that tracks users more closely and sells that information on — or selectively presents information to users based on what the owners of the app are trying to achieve.

How are you actually going to learn about problems — and what can you really do to enforce any terms?

Currently, we have a relatively small number of users of the API, so can keep an occasional eye on the projects that use it. If we had many more users, we’d realistically have to rely on people spotting problems and alerting us to them.

As was pointed out, in some cases, naming and shaming could be our most powerful course of action. But being named and shamed by a tiny non-profit might not matter to someone vastly better resourced, or indeed, anyone. The power to switch off access to the API does seem like a fair tool to keep in our locker, so that at least we can show we took some action and prevent some further harm, even if by then the damage is done. Our ability to do this shows a certain seriousness to those who provide the data for the API too.

Isn’t any term of use a barrier to reuse?

Yes, but our goal is not to maximise reuse. Our goal is to get people better access to election information.

We do not have that many users of the API at the moment. It doesn’t seem unreasonable to gather some contact details so we can get in touch to ask about what people are doing with the data. We can have a chat, and we might be able to help them be more effective at helping democracy. Happy days. So far, nobody’s told us our terms are too strict to build things with (of course, it’s possible that they just don’t tell us).

Does reporting the API use really help prove impact and support funding applications?

It’s true that impact is fuzzy at the best of times, but it is good to be able to understand where traffic is coming from, in order to focus resources. Again, having this information doesn’t seem an unreasonable quid-pro-quo for the use of the data. It shouldn’t be a dealbreaker though, so if an organisation wants to avoid this, they can get in touch to discuss different terms.

Ultimately, we hope that a publicly funded body will take on the production of this infrastructure stuff. They’ll be much better placed to ensure its sustainability and to have a think about openness levels. The Candidates data is produced by volunteers wombling away in their spare time, but on a platform that is built by paid developers. Joe liked Jeni T’s idea of throwing open the data and then charging organisations to use a Democracy Club kitemark on their projects, but the market is almost certainly not there.

🎩 So what does this all mean? 🎩

Currently, anyone can already download one of many thousands of CSVs of candidates data. That will continue. So much data has already been added under a CC-BY licence that it could be impossible to slice it up correctly anyway. However, we will add a note to the CSV downloads pointing out that it’s personal data and so users should take note of the Data Protection Act 2018, particularly with regard to s.8(e). Thanks to Sam for spotting this. We’ll also ask people to please let us know how it’s being used.

In terms of APIs, we’ve had various rules in place across various APIs previously. We’re now moving to just one API that does everything — and at the moment, we ask people to get in touch for a key. This is going to change. The new plan is that anyone who agrees to the below terms will be able to complete a form on our website and be automatically given an API key. There won’t need to be any interaction with the team here. The only rate limit that will apply will be in order to protect the thing from falling over.
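
A rate limit of the "stop it falling over" kind is often built as a token bucket per API key. Below is a minimal sketch, assuming a hypothetical `TokenBucket` class and made-up numbers. It shows the general technique, not the limiter the API will actually use.

```python
import time
from typing import Callable


class TokenBucket:
    """Allow short bursts while capping the long-run request rate."""

    def __init__(self, rate_per_second: float, burst: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate_per_second      # tokens added per second
        self.capacity = float(burst)     # maximum tokens that can be stored
        self.tokens = float(burst)       # start with a full bucket
        self.clock = clock
        self.updated = clock()

    def allow(self) -> bool:
        """Return True if a request may proceed, spending one token."""
        now = self.clock()
        elapsed = now - self.updated
        self.updated = now
        # Refill in proportion to the time passed, up to the capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# Hypothetical limit: bursts of 2 requests, refilling at 1 request per second.
# A fake clock is used here so the behaviour is easy to follow.
fake_now = [0.0]
bucket = TokenBucket(rate_per_second=1.0, burst=2, clock=lambda: fake_now[0])
results = [bucket.allow(), bucket.allow(), bucket.allow()]
fake_now[0] = 1.0                        # one second later
results.append(bucket.allow())
print(results)
```

In practice there would be one bucket per key, and a refused request would usually get an HTTP 429 response so the client knows to slow down.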

In exchange for a key, users will agree to:

  • Attribute Democracy Club, ideally with our logo and a link to our site;
  • Where possible, send Democracy Club a link to whatever is created and any evidence of impact;
  • Allow Democracy Club to publicly share the number of API requests made.

We will say that if these terms aren’t suitable, users can get in touch to discuss different terms.

We will also reserve the right to withdraw access if the data is used in such a way that:

  • Misleads voters (for example, an app to show users their nearest polling station, instead of the correct one allocated to them based on their registered address);
  • Harms Democracy Club’s ability to deliver the data in future (e.g. by weakening the trust placed in us by volunteers or electoral administrators);
  • Harms the perception of Democracy Club as a non-partisan actor;
  • Harms the perception of the accuracy of Democracy Club’s data;
  • Breaks the law (e.g. by using personal data of candidates for purposes other than an activity in the public interest; see s.8(e) of the Data Protection Act 2018).

Thanks to those who suggested ‘harm democracy’ was too vague and that we needed some examples. Hopefully those help.

We’ll also provide a bit more on the process of what happens if things go wrong:

“In the event of problems brought to our attention:

  • Democracy Club will raise the issue with the user, remind them of the terms and suggest steps to ameliorate the issue.
  • If steps are not taken, or there is no response from a user, the issue will be reviewed by three board members and access may be removed. The executive director(s) have the option to immediately suspend use before the review if they believe significant harm is occurring.
  • A final review can be undertaken by the full board if the user requests it.”

So that’s where we’ve got to for now. Thoughts and feedback still very welcome on the Google Doc version of the terms.

We’ll need to do some work to implement this new approach — hopefully we’ll be able to do that before the next election…depending on when it arrives. And once we’ve recorded the contact details of some users, we’ll be able to do more user research too. We’ll set a reminder to review how things are going in 12 months’ time.

📅 What’s next? 📅

Normal blogging will resume next week, but if you’re wondering what else we’re up to, check out our current tasks via Trello.

And there seems to be even more noise about a possible general election than usual…we’re looking for a partnerships manager to help us out in the event of said election. Please share it around!

Joe is excited to be heading to Glasgow tomorrow for a conference on civic education as delivered across Europe. He’s hoping to steal lots of ideas for making democracy better in the UK.

Forward!

🐾

Get in touch:

Jump into the online chat in Slack, tweet us, or email hello@democracyclub.org.uk.