Source: Deep Learning on Medium
Today I officially join Gospel Technology, a distributed ledger startup in London. Gospel built a graph database on a distributed ledger, made for sharing super-secret PII in a GDPR-compliant way.
Super secret, personally identifiable information, being shared — correct.
Last year, the excellent data science team at Peltarion was building deep neural networks for customers, and our biggest problem was getting our hands on good data, or getting it to the cloud for training. Anonymisation only went so far: it was often too easy to identify someone from a handful of combined data points, even in anonymised data. It became very clear that data custodians were simply not going to share their customers’ data. To be fair, those data custodians and decision makers were right, since to date “sharing” has just meant “copying”. And as we all know from the age of mp3s, a copy can travel really far. That’s no way to treat someone’s medical records or list of bank transactions. Yet this kind of information needs to be shared to be useful to you, to build better prediction models, and to give insight to businesses that need to plan and optimise how to lower costs and serve their customers better.
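A minimal sketch of that re-identification risk, using invented records and hypothetical field names (nothing here comes from a real dataset or from Gospel's product). It measures what fraction of rows become unique as more seemingly harmless fields are combined:

```python
from collections import Counter

# Hypothetical "anonymised" records: names are gone, but a few
# quasi-identifiers remain. All values are invented for illustration.
records = [
    {"postcode_area": "WR1", "birth_year": 1980, "sex": "F"},
    {"postcode_area": "WR1", "birth_year": 1980, "sex": "M"},
    {"postcode_area": "WR1", "birth_year": 1975, "sex": "F"},
    {"postcode_area": "WR2", "birth_year": 1980, "sex": "F"},
    {"postcode_area": "WR2", "birth_year": 1980, "sex": "F"},
    {"postcode_area": "WR1", "birth_year": 1980, "sex": "F"},
]


def unique_fraction(rows, fields):
    """Return the fraction of rows whose combination of `fields` is unique."""
    keys = [tuple(row[f] for f in fields) for row in rows]
    counts = Counter(keys)
    return sum(1 for key in keys if counts[key] == 1) / len(rows)


# Combine one more field at each step and see how many rows stand alone.
all_fields = ["postcode_area", "birth_year", "sex"]
for n in range(1, len(all_fields) + 1):
    fields = all_fields[:n]
    print(fields, round(unique_fraction(records, fields), 2))
```

On this toy data no row is unique by postcode area alone, but a third of the rows are unique once all three fields are combined. A unique row is one that anyone holding those same three facts about a person could match back to them.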
For some things, this sharing is already happening (often badly implemented, but at least it’s happening):
Medical records: a GP’s notes can give good context in an emergency.
Bank details: by sharing with credit scoring agencies, banks can stop giving loans to bad actors who drive up risk and, with it, prices.
But in many cases it is not yet happening, due to a lack of expertise or focus on the part of smaller data owners. Gospel is enabling it. One of my favourite use cases is with the Worcestershire Office of Data Analytics (WODA), where a social worker can see parts of very sensitive information about children in care. Imagine you are a social care worker with 100 children on your list, and little Amy is number 15. Sharing the Police and NHS records connected with those children with everyone in the social care department would be complete overkill, but you can soon share metadata on events in that private data with a handful of people, right from the data layer. Imagine Amy’s foster parent was involved in a road rage incident three counties away, and later that week a GP is looking at a bruise on Amy’s arm, worried about how anxious and upset her foster parent was. What’s happening in this woman’s world? Is this drug related? Is Amy in trouble, or is this poor mother just having a bad week? You can now spend your next 30 minutes calling that mother to help. As it stands today, you wouldn’t even know of this woman’s anguish, and if you did get wind of anything, you’d spend three days on the phone getting access to NHS or Police records.
Last year was about using data to build deep learning models to drive humanity forward, and I found there was a real problem with data: businesses didn’t have enough. Amazon, Google and Facebook have good data, but most smaller firms and councils wanting to do good need to share. This year is about helping people protect and share their data and do the best by their customers, while enabling data, the oil of our time, to be used for good. With any luck we’ll soon be building better models on shared data in a GDPR-compliant way.
In a follow-up post I would love to talk about why this fancy new sharing database actually works and go into the patent that underpins it, but for now I am very happy to start work on a product that finally implements the utopian ideal of trust as described in a legal contract. It took millions of pounds to create, and it is built on, among other things, open source tech from my alma mater Google and an ingenious way to secure how data is read from an underlying blockchain/distributed ledger.
Let the sharing begin.