The morning’s session began with Professor Karen Yeung from King’s College chairing. She is co-author of Law, Regulation and Technology: the Field, Frame and Focal Questions. Isn’t that interesting for us in the Law Special Interest Group at the British Computer Society?
The first speaker suggested that:
- A ‘census of buildings’ would be valuable information!
- The types of buildings would give information about their vulnerability or resilience!
A whole new world opens for our new visualisation styles for images, multi-dimensional data and time series. That’s for sure!
The second talk was given by Professor Kobbi Nissim of Georgetown University on Differential Privacy and how it compares with legal standards of privacy. He distinguished between:
- technical and social aspects of privacy and referred to
- FERPA (the Family Educational Rights and Privacy Act), a legal privacy standard passed by the US Congress in 1974 – and he asked:
- “who is the privacy attacker?”
- a ‘reasonable person who does not have personal knowledge of the relevant circumstances’.
- A FERPA security game would challenge an attacker to be able to identify a person from published information.
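Nissim’s contrast between legal standards and differential privacy can be made concrete with the classic Laplace mechanism – a minimal sketch, not from the talk itself; the scores and the epsilon value below are invented for illustration:

```python
import math
import random

def laplace_noise(scale):
    # Inverse-CDF sampling from Laplace(0, scale).
    u = random.random() - 0.5
    return -scale * math.copysign(math.log(1 - 2 * abs(u)), u)

def dp_count(values, predicate, epsilon):
    # A counting query has sensitivity 1: adding or removing one
    # person changes the count by at most 1, so Laplace noise with
    # scale 1/epsilon gives epsilon-differential privacy.
    true_count = sum(1 for v in values if predicate(v))
    return true_count + laplace_noise(1.0 / epsilon)

# Hypothetical exam scores; release a noisy pass count.
grades = [72, 48, 91, 65, 55, 88]
noisy = dp_count(grades, lambda g: g >= 50, epsilon=0.5)
```

The point of the game framing: whatever an attacker learns about an individual from `noisy`, they could almost equally have learned without that individual’s record being present.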
Data Science for Public Sector was the next talk – by Slava Mikhaylov of the University of Essex. He started with
- school readiness
- and added the ‘system approach’ aka ‘Quadruple Helix Innovation’ to include:
- government / public
- benefits for academia
- he quoted Francois Chollet [@fchollet] who works on Deep Learning at Google: “… tends to benefit established large companies, rather than nimble upstarts with better tech.”
- he presented the Challenge Lab – where master’s students take on the planet’s biggest challenges, together with industry, government and academia.
The discussion offered this ‘plug’ for the BBC Data Science Research Partnership.
The automation of political communication on Twitter: the case of the Brexit botnet was next – where Dr Dan Mercea presented how Twitter was abused to amplify LEAVE tweets:
- 10M tweets were collated;
- 794,949 Twitter profiles;
- 40,031 accounts (5% of all users) were deactivated, removed, blocked, set to private, or had their username altered after the referendum.
The drop-off in tweeting after the referendum was so noticeable that these accounts were classified as ‘bots’ rather than people:
- a fictitious identity, in breach of social media ToS;
- relying on thresholding and filtering approaches;
- ‘cascading’ by retweeting!
E.g. account names were webClient, cascadeMean, cascadeMax, ccdMeanTime, commonWords, tw2user, ts2rtMean.
Also the response time to tweets was unusually short!
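The thresholding-and-filtering idea can be sketched as a toy filter on retweet latency – entirely illustrative, with invented account names, latencies and thresholds, not the researchers’ actual pipeline:

```python
from statistics import median

def flag_suspected_bots(accounts, max_median_latency=5.0, min_retweets=3):
    # accounts: {username: [seconds between a source tweet and this
    # account's retweet of it]}. Flag sufficiently active accounts
    # whose median time-to-retweet is implausibly short for a human.
    flagged = []
    for name, latencies in accounts.items():
        if len(latencies) >= min_retweets and median(latencies) <= max_median_latency:
            flagged.append(name)
    return flagged

# Toy data: 'cascadeMean' retweets within seconds; a human takes minutes.
sample = {
    "cascadeMean": [1.2, 0.8, 2.5, 1.9],
    "human_user": [340.0, 1200.0, 95.0, 610.0],
}
print(flag_suspected_bots(sample))  # → ['cascadeMean']
```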
Here comes the ethical dilemma:
- should we point out who the bots are?
- one account with 175K tweets was classified as a ‘bot’;
- a ‘bot farm’ was used where users customise their bot and select the time of usage.
More important insights:
- deleted accounts were more active than active user accounts;
- they posted 1 retweet for every tweet;
- Vote Leave was mentioned by far the most;
- the number of external dead links was much higher than valid links;
- the number of valid Twitter links was higher than the dead links.
So here we have a planned campaign of fake news:
- 83% of accounts in the botnet had been created in the previous 2 years, compared with 43% for the subset of active accounts and 48% for accounts that ended up being recycled;
- tweets posted by bots were characterised by the absence of seasonal patterns and posted at ‘odd hours’;
- bots triggered large message cascades very quickly after a tweet.
Curated retweeting was visually compared with ‘botnet amplification’.
- Bot activity was arguably successful;
- Bots aggregated and retweeted content tweeted by seed users, which may conceivably be bots themselves: ‘false amplification’;
- The botnet failed to generate any large cascade of 1K retweets, while the active user base successfully generated nearly one hundred such cascades;
- Bot detection is work in progress!
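The cascade comparison above can be made concrete with a toy counter that groups retweets by their source tweet and counts how many cascades clear a size threshold – hypothetical data and threshold, not the study’s method:

```python
from collections import Counter

def count_large_cascades(retweet_sources, threshold=1000):
    # retweet_sources: one source-tweet id per observed retweet,
    # so each id's frequency is the size of that tweet's cascade.
    sizes = Counter(retweet_sources)
    return sum(1 for size in sizes.values() if size >= threshold)

# Toy stream: tweet 'a' is retweeted three times, 'b' once.
stream = ["a", "a", "b", "a"]
print(count_large_cascades(stream, threshold=3))  # → 1
```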
The paper that resulted from this fabulous research is The Brexit Botnet and User-Generated Hyperpartisan News.
The afternoon concentrated on Data Analytics for Human Health and started with Chloe-Agathe Azencott on Machine Learning and Genomics: Precision Medicine vs Patient Privacy.
As a member of society whose work is funded by public money, these are the issues that matter to her:
- Legislation against genetic discrimination
- Article 21 of the EU Charter of Fundamental Rights explicitly includes “genetic features”!
- How can we protect genomic privacy?
- de-identification is not enough
- Ethical solutions
- privacy is dead!
- trust not privacy!
- P4 Medicine: Preventive, Predictive, Personalised and Participatory!
- How far can we trust algorithms?
- Beware of genetic determinism!
The last speaker was David Madigan on Empirical Calibration for Effect Size Estimation in Observational Healthcare Studies.
- How does evidence-based medicine + clinical intuition work in practice?
- Population-level estimation
- Patient-level prediction / Precision medicine
- Clinical characterization.
I ended up feeling exhausted from excitement! But Citizens’ Juries are a new concept I would love to see acted upon and I sincerely hope that the Royal Society will address this topic for the common good on a regular basis.