(By Max Brennan, Colorado Law 2L)
This week’s blog post examines the concept of algorithm bias. It begins with a definition of algorithm bias, turns to its interactions with the law and some real-world examples of bias, and ends with considerations for the future legal treatment of algorithm bias.
What is Algorithm Bias?
With the rise of big data, governments and private entities have developed tools to analyze and act on this data. Big data as a concept is defined by a high volume of information, a high variety of information and sources, and a high velocity at which that information arrives.
To generate actionable information from these large, constantly updating data sets, governments and companies employ algorithms. An algorithm is generally defined as a self-contained, step-by-step set of operations that performs calculations, data processing, and automated reasoning tasks.
Importantly, many algorithms used to perform these tasks are generated and continually changed using machine learning. Machine learning was defined in 1959 as a “field of study that gives computers the ability to learn without being explicitly programmed.” In modern computer science and data analytics, machine learning can be defined as a “field of scientific study that concentrates on induction algorithms.” This means that the computer program can change the way it performs tasks on the data based on the data itself rather than on human input.
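To make the idea of a program changing its behavior "based on the data" concrete, here is a minimal sketch in Python. It is an illustration only, not a description of any system discussed in this post: the function names, the scores, and the two sets of historical records are all hypothetical.

```python
from statistics import mean

def learn_threshold(examples):
    """Learn a cutoff from past (score, approved) records.

    The cutoff is placed midway between the average score of past
    approvals and the average score of past denials. No human picks it.
    """
    approved = [score for score, was_approved in examples if was_approved]
    denied = [score for score, was_approved in examples if not was_approved]
    return (mean(approved) + mean(denied)) / 2

def predict(threshold, score):
    """Apply the learned rule to a new applicant."""
    return score >= threshold

# Two hypothetical sets of historical decisions.
history_a = [(580, False), (600, False), (700, True), (720, True)]
history_b = [(640, False), (660, False), (760, True), (780, True)]

threshold_a = learn_threshold(history_a)
threshold_b = learn_threshold(history_b)

# The same code gives different answers for the same applicant,
# depending only on the records it learned from.
decision_a = predict(threshold_a, 680)
decision_b = predict(threshold_b, 680)
```

The code is identical in both cases; only the historical records differ. If those records reflect past discrimination, the learned rule inherits it.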
Algorithm Bias and the Law
In a 2014 report discussing big data, the Office of the President stated: “the civil rights community is concerned that such algorithmic decisions raise the specter of ‘redlining’ in the digital economy—the potential to discriminate against the most vulnerable classes of our society under the guise of neutral algorithms.” Redlining was the historical practice of “mortgage lenders of drawing red lines around portions of a map to indicate areas or neighborhoods in which they do not want to make loans.” Under the Fair Housing Act it is currently legal to discriminate based on economic factors but not on race, color, religion, national origin, sex, familial status, or disability.
One concern with algorithms used to generate scores about people, especially those algorithms developed using machine learning, is that they might result in a disparate impact without explicitly discriminating based on one of the prohibited classifications. This means that the algorithms will not be discriminatory on their face, but rather discriminatory in their effect. For example, an algorithm that scores job applicants using their consumption patterns could penalize purchases of products that are statistically bought more often by women, producing discrimination in employment without ever using sex as an input.
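The following Python sketch illustrates how a facially neutral rule can produce a disparate effect. Everything in it is hypothetical: the applicant records, the `buys_product_x` feature, and the screening rule are invented for illustration. The 80 percent comparison in the final comment refers to the "four-fifths" rule of thumb from the EEOC's Uniform Guidelines on Employee Selection Procedures, which this post does not otherwise discuss and which is only one possible way to flag a disparity.

```python
def selection_rates(applicants, decide):
    """Return the fraction of each group that the rule selects."""
    totals, selected = {}, {}
    for applicant in applicants:
        group = applicant["group"]
        totals[group] = totals.get(group, 0) + 1
        selected[group] = selected.get(group, 0) + (1 if decide(applicant) else 0)
    return {group: selected[group] / totals[group] for group in totals}

def impact_ratio(rates):
    """Ratio of the lowest group selection rate to the highest."""
    return min(rates.values()) / max(rates.values())

def neutral_looking_rule(applicant):
    # Never looks at the applicant's group, only at a purchase pattern.
    return not applicant["buys_product_x"]

# Hypothetical pool: product X is bought more often by women.
applicants = (
    [{"group": "women", "buys_product_x": True}] * 6
    + [{"group": "women", "buys_product_x": False}] * 4
    + [{"group": "men", "buys_product_x": True}] * 2
    + [{"group": "men", "buys_product_x": False}] * 8
)

rates = selection_rates(applicants, neutral_looking_rule)
ratio = impact_ratio(rates)

# A ratio below 0.8 would fall short of the four-fifths rule of thumb.
flagged = ratio < 0.8
```

In this invented pool the rule selects 40 percent of the women and 80 percent of the men, even though the protected characteristic never appears in the rule itself.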
In a 2016 law review paper, researchers found that although Title VII prohibits discrimination in employment, courts have sanctioned the use of algorithms if the algorithm’s “outcomes are predictive of future employment outcomes,” even where the input data may embody historical discrimination. The Office of the President’s report continues:
For important decisions like employment, credit, and insurance, consumers have a right to learn why a decision was made against them and what information was used to make it, and to correct the underlying information if it is in error.
In an era of complex algorithms and machine learning, a human might not be able to explain or justify a particular decision that was made on the basis of a high volume and variety of data manipulated by a learning algorithm.
Real-World Examples of Algorithm Bias
There are many real-world examples showing algorithm bias, with varying degrees of relevance to the current legal framework and the development of the law in this area. A 2015 study showed that setting a user’s gender to female in the Google search context resulted in a statistically significant reduction in the number of advertisements related to high-paying jobs as compared to setting the gender to male. Another 2015 study examined the Princeton Review’s online tutoring service, whose highest tier was offered at four different price points, and showed that customers were 1.8 times more likely to be charged higher prices if they lived in areas with a high density of Asian residents.
One of the closest interactions between algorithms and the government is described in a 2016 article about computer algorithms used to score the likelihood that an arrested individual will commit a future crime. In Florida, a company supplies software to Broward County that uses 137 questions and a proprietary scoring system to generate a score meant to embody the likelihood that a person commits a crime in the future. Holding many factors constant, “black defendants were still 77 percent more likely to be pegged as at higher risk of committing a future violent crime and 45 percent more likely to be predicted to commit a future crime of any kind.” Moreover, the scores have proved to be poor predictors: only “20 percent of the people predicted to commit violent crimes actually went on to do so.” Judges are provided with these scores, and the scores factor into sentencing. This implicates due process considerations, as defendants cannot cross-examine a proprietary algorithm on the stand.
The Future of Algorithm Bias and the Law?
In a 2016 New York Times essay, Danielle Keats Citron argued for regulatory oversight of algorithm outputs that are used by both government and private entities and that factor into decisions concerning life, liberty, or property. She argued for a “technological due process,” potentially administered in the form of a Federal Trade Commission audit of these algorithmic scoring systems.
Can we legally account for historical discrimination in the input data, or is this more of a computer program architecture issue? As larger volumes of more varied data enter these machine learning algorithms at ever greater speed, how do we ensure that disparate impact does not become a problem? What private or public entity might be best positioned to address these issues?