Data Science + Ethics

Ethics in dealing with data. 

A paper on Human-Data Interaction by Richard Mortier, Hamed Haddadi, Tristan Henderson, Derek McAuley, and Jon Crowcroft: Human-Data Interaction: The Human Face of the Data-Driven Society, http://arxiv.org/abs/1412.6159. A nice summary in MIT Tech Review.

A conference at MIT on Digital Experimentation, October 10-11, 2014: www.codecon.net. Program here.

A conference at Stanford, September 15-16, 2014: Ethics of Data in Civil Society.

A nice talk on Data.Stories with Kate Crawford of Microsoft Research. Data.Stories is a great podcast series on data visualization by Enrico Bertini and Moritz Stefaner.

InformationWeek (October 2013) has an interview with the founder of a young organization calling itself the Data Science Association. In the article, the founder says:

“Things were really getting out of control in terms of the definition of ‘data science,’” said Walker in a phone interview with InformationWeek. “A lot of people who really weren’t data scientists started calling themselves data scientists. And I saw a lot of data science malpractice in the companies, or clients, that we work with.”

Here’s a link to the Data Science Association’s Code of Conduct.

Here is an extract from the Code of Conduct, point (f), just to show you how thoughtful these folks are.

(f) A data scientist shall not knowingly:
(1) fail to use scientific methods in performing data science;
(2) fail to rank the quality of evidence in a reasonable and understandable manner for the client;
(3) claim weak or uncertain evidence is strong evidence;
(4) misuse weak or uncertain evidence to communicate a false reality or promote an illusion of understanding;
(5) fail to rank the quality of data in a reasonable and understandable manner for the client;
(6) claim bad or uncertain data quality is good data quality;
(7) misuse bad or uncertain data quality to communicate a false reality or promote an illusion of understanding;
(8) fail to disclose any and all data science results or engage in cherry-picking;
(9) fail to attempt to replicate data science results;
(10) fail to disclose that data science results could not be replicated;
(11) misuse data science results to communicate a false reality or promote an illusion of understanding;
(12) fail to disclose failed experiments or disconfirming evidence known to the data scientist to be directly adverse to the position of the client;
(13) offer evidence that the data scientist knows to be false. If a data scientist questions the quality of data or evidence the data scientist must disclose this to the client. If a data scientist has offered material evidence and the data scientist comes to know of its falsity, the data scientist shall take reasonable remedial measures, including disclosure to the client. A data scientist may disclose and label evidence the data scientist reasonably believes is false;
(14) cherry-pick data and data science evidence.

Good to see IBM’s Big Data Evangelist James Kobielus on Data Science and the need for a Code of Conduct.

 

 © 2014-2015 ContextBridge, Inc.
