Good Cops, Bad Data? The Social and Ethical Dangers of Data-Driven Policing

Dia Porter

Abstract: Police departments across the nation have deployed predictive software models and computer algorithms to automate their policing strategies, a practice collectively referred to as data-driven policing. While these algorithms have the potential to reduce bias, increase efficiency, and improve crime prediction accuracy, they also threaten to reinforce racial biases and patterns of inequity. With automation and algorithmic technologies becoming integral tools throughout the criminal justice system, it is critical for policymakers to evaluate and assess the social ethics and validity of data-driven policing.

I. Introduction: The Rise of Automation

The American public has become increasingly vocal in its demands for criminal justice reform. As cries to “abolish” and “defund” the police continue to catch fire across the nation, government officials have scrambled to find ethical and objective methods to address continuing instances of systemic racism and to improve the American criminal justice system. Modern police tactics, such as community policing and stop-and-frisk, were deployed by law enforcement entities with the express intent of improving policing efficiency and crime prevention effectiveness (Koper et al. 2014; Moravec 2019). However, as policing strategies have continued to evolve, the role of technology has become more prevalent – with seemingly nonexistent oversight protocols.

Since the late twentieth century, United States (U.S.) law enforcement agencies – throughout all levels of government – have been urged to integrate technology into their policing practices. In 1965, President Lyndon B. Johnson established the Commission on Law Enforcement and Administration of Justice in response to the “state of emergency” presented by the nation’s rising crime rate and the general state of the American criminal justice system. After two years of in-depth review, the Commission’s 1967 report, “The Challenge of Crime in a Free Society,” called for the application of modern technologies to improve law enforcement practices, stating: “[Technology] can provide considerable help to law enforcement. We must [assess the] devices we want relative to the price…Science can provide capability, but the [public must participate in the decisioning] of whether or not the capability is worth its financial and social costs” (President’s Commission on Law Enforcement and Administration of Justice 1967). Today, law enforcement continues to dedicate vast portions of its annual budgets to the development, acquisition, and deployment of data-driven policing measures. While law enforcement has surely met the 1967 objective of increased data analysis and reliance on technology, the criminal justice system has largely missed the mark regarding the social costs of data-driven technologies.

II. State of the Nation

The U.S. criminal justice system presently holds nearly 2.3 million people across 1,833 state prisons, 110 federal prisons, 1,772 juvenile corrections facilities, 3,134 local jails, 218 immigration detention facilities, and 80 Indian Country jails, as well as in military prisons, civil commitment centers, state psychiatric hospitals, and prisons in the U.S. territories (Sawyer and Wagner 2020). Minority group overrepresentation is prevalent throughout all correctional facilities. While only 12 percent of the U.S. population identifies as Black or African American (U.S. Census Bureau 2020), nearly 40 percent of the more than two million persons currently incarcerated in adult correctional facilities (including children who have been charged as adults) identify as Black or African American [1]. Other people of color (POC) – including those identifying as multi-racial, Asian American, Native American, and members of the Latinx community – comprise an additional 28.5 percent of the “adult” correctional facility population (Prison Policy Initiative 2020; Sakala 2014). In 2019, the arrest rates per 100,000 people were 5,723 for Black Americans and 2,750 for white Americans (U.S. Department of Justice 2019). Reviews of officer traffic stop reports show that Black people are stopped and arrested at over twice the rate of white Americans. Further, despite driving 16 percent less than whites, Black people are 63 percent more likely than whites to be stopped for traffic violations; when accounting for their reduced road time, the likelihood jumps to 95 percent (Baumgartner et al. 2018).
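
One way to see how those last two figures relate – a back-of-the-envelope reconstruction, not Baumgartner et al.’s exact method – is to divide the raw stop disparity by the share of road time: if Black drivers are stopped 1.63 times as often as white drivers overall while logging only 84 percent as much driving, then

$$\frac{1.63}{1 - 0.16} = \frac{1.63}{0.84} \approx 1.95,$$

or roughly 95 percent more likely to be stopped per unit of time on the road.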

Ongoing public scrutiny concerning police brutality and killings, racial bias and community surveillance, and a general lack of officer accountability and transparency has amplified the need for a review of several traditional and new-age policing tactics. One method law enforcement departments have adopted to address rising civil unrest and community outrage effectively and economically is the deployment of data-driven policing technologies. Data-driven policing is a broad term used to describe law enforcement’s use of algorithm-based programs, tools, and technology to support its policing duties. Historically, algorithms have been used by computer scientists and mathematicians as a problem-solving method; today, algorithms are embedded in nearly every aspect of daily human interaction. Algorithms allow for real-time search suggestions and guide the content seen on one’s social media feed. Data inputs drive an algorithm’s outputs: the algorithm identifies trends in its training data and uses what it learns to improve and automate its intended function. Algorithmic equations can blend data sets to present a clearer picture of existing correlations; these correlations are then used to inform predictions about possible outcomes. Data-driven policing is meant to be equipped with algorithmic technologies that allow for “objective” crime prevention and “science-based” predictions. However, the datasets used to train these algorithms present the potential for the continuance of racially disparate policing practices.

III. Engineered Racism

Data-driven policing technologies utilize historical crime reports as their dataset – the same crime reports that reflect decades of documented discriminatory patterns in the surveillance, criminal profiling, pursuit, and arrest of Black people and POC. Data-driven policing algorithms conduct risk assessments of their datasets by reviewing data inputs – such as living in a crime hot spot – for correlations that inform data outputs, like committing a violent crime. Two prominent data-driven policing technologies that reveal racial inequalities are PredPol and COMPAS.
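
A minimal sketch can make that input-output mechanism concrete. The toy model below is hypothetical in every particular – the single feature, the numbers, and the use of a simple logistic regression all stand in for proprietary systems – but it shows how a classifier trained on arrest records from an over-patrolled “hot spot” reproduces the patrol pattern as a “risk score”:

```python
# Toy illustration: a "risk score" learned from biased arrest records.
# Everything here is hypothetical; this is not any vendor's actual model.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 10_000

# Single input feature: does the person live in a designated hot spot?
hot_spot = rng.integers(0, 2, size=n)

# Ground truth: offending is equally likely (10%) in both neighborhoods.
offended = rng.random(n) < 0.10

# What the algorithm actually sees is an *arrest record*. Assume heavy
# patrols record 90% of offenses in the hot spot but only 30% elsewhere.
caught = rng.random(n) < np.where(hot_spot == 1, 0.9, 0.3)
arrested = offended & caught

model = LogisticRegression().fit(hot_spot.reshape(-1, 1), arrested)
risk_out, risk_in = model.predict_proba([[0], [1]])[:, 1]
print(f"hot-spot resident risk: {risk_in:.3f}, other resident: {risk_out:.3f}")
# Roughly 0.09 vs. 0.03: identical underlying behavior, but the model
# scores hot-spot residents about three times "riskier" because it has
# learned the enforcement pattern, not the offending pattern.
```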

The PredPol algorithm utilizes historical crime data – by way of crime reports and offender profiles – to identify patterns in crime type, location, and time; the resulting data output provides a crime forecast intended to identify high-risk crime locations based on crime type and time of day. In 2018, amid many community complaints and criticisms of PredPol, the Office of the Inspector General of the Los Angeles Police Commission conducted an official audit of all data-driven policing technologies in use by the Los Angeles Police Department (LAPD) [2], with the mission to identify and better understand any significant disparities – particularly potential racial disparities – in the data. The audit findings spoke to significant inconsistencies in the application of policing procedures and record keeping, further illustrated by identified patterns of minority overrepresentation in the frequency of stops and underrepresentation of white Americans in the crime report data reviewed (Office of the Inspector General 2020). PredPol’s data source may be a main source of contention, as the historical crime reports it utilizes reflect ongoing policing practices (e.g., Broken Windows policing [3], Stop-and-Frisk [4]) that have been repeatedly proven to disproportionately impact communities of color.
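
That concern can be stated mechanically. PredPol’s production model is proprietary (reportedly a self-exciting point process), so the sketch below is only a simplification with made-up incident data; it ranks map grid cells by recent recorded incidents, which is enough to show how patrol-driven records can entrench a forecast:

```python
# Simplified hot-spot forecaster: rank grid cells by recorded incidents.
# PredPol's actual model is proprietary; this sketch and its incident
# data are hypothetical, chosen only to illustrate the feedback loop.
from collections import Counter

# (grid_cell, crime_type, hour_of_day) tuples from historical reports.
crime_reports = [
    ("cell_12", "burglary", 23), ("cell_12", "burglary", 22),
    ("cell_12", "assault", 1),   ("cell_07", "burglary", 14),
    ("cell_31", "theft", 18),    ("cell_12", "theft", 21),
]

def forecast_hot_spots(reports, top_k=2):
    """Return the top_k cells with the most recorded incidents."""
    counts = Counter(cell for cell, _, _ in reports)
    return [cell for cell, _ in counts.most_common(top_k)]

print(forecast_hot_spots(crime_reports))  # ['cell_12', 'cell_07']

# The feedback loop: sending extra patrols to cell_12 produces extra
# *recorded* incidents there, which raises its count in next period's
# input, so the cell keeps being flagged whatever the true crime rate.
```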

The Correctional Offender Management Profiling for Alternative Sanctions (COMPAS) technology has become one of the most widely adopted offender assessment algorithms in the United States. New York State has been a heavy COMPAS user, having deployed the system throughout its probation departments – excluding New York City – since 2001 (Angwin et al. 2016). Developed in the 1990s by the private technology company Equivant (formerly Northpointe), COMPAS utilizes crime reports to inform its algorithm. Unfortunately, due to Equivant’s status as a privately owned institution, details regarding data type specifications are protected information, inaccessible to the general public. Notwithstanding, the COMPAS algorithm is purported to conduct a risk assessment to determine a defendant’s recidivism rate (i.e., the likelihood of an offender to repeat criminal behavior). The resulting data outputs are used throughout the American court systems to determine critical outcomes, such as a defendant’s eligibility for pretrial bail release.

A 2016 ProPublica study conducted in Broward County, Florida, used 2013 and 2014 COMPAS risk scores for 10,000 criminal defendants to assess the validity of the defendants’ recidivism scoring two years after their initial arrests. ProPublica found that white defendants were 63 percent more likely than Black defendants to be misclassified as having a low risk of recidivism for violent crimes – meaning that, while the COMPAS algorithm predicted that white defendants would be less likely than Black defendants to violently re-offend, in actuality, white defendants proved to have higher rates of violent crime recidivism than their Black counterparts (Angwin et al. 2016). Concurrently, Black defendants were misclassified as having a high recidivism rate at twice the rate of white defendants – meaning COMPAS outputs were twice as likely to inaccurately categorize Black defendants as being at high risk for repeat criminal activity, further demonstrating the presence of algorithmic and data bias (Angwin et al. 2016). A high COMPAS risk score can significantly impact the outcome of an individual’s prison sentencing. Such miscalculations serve to continue patterns of minority overpopulation within correctional facilities and to disrupt an individual’s ability to reintegrate into society. The study’s results speak loudly to an algorithmic flaw and to the very real dangers of automating critical decisions regarding the trajectory of an individual’s life.
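
The disparity ProPublica described is a property of the algorithm’s error rates, and the computation is straightforward. The sketch below uses a hypothetical miniature dataset (ten invented records, not ProPublica’s data) to show the two rates at issue: the false-positive rate (non-reoffenders labeled high risk) and the false-negative rate (reoffenders labeled low risk), broken out by race:

```python
# Per-group misclassification rates from (group, labeled_high_risk,
# reoffended) records. These ten records are invented for illustration;
# ProPublica ran this style of analysis on 10,000 Broward defendants.
records = [
    ("Black", True, False), ("Black", True, False), ("Black", True, True),
    ("Black", False, True), ("Black", False, False),
    ("white", True, False), ("white", False, True), ("white", False, True),
    ("white", False, False), ("white", False, False),
]

def error_rates(rows):
    fp = sum(1 for _, high, re in rows if high and not re)  # wrongly flagged
    fn = sum(1 for _, high, re in rows if not high and re)  # wrongly cleared
    neg = sum(1 for _, _, re in rows if not re)             # non-reoffenders
    pos = sum(1 for _, _, re in rows if re)                 # reoffenders
    return fp / neg, fn / pos

for group in ("Black", "white"):
    fpr, fnr = error_rates([r for r in records if r[0] == group])
    print(f"{group}: false-positive {fpr:.0%}, false-negative {fnr:.0%}")

# In ProPublica's real data the pattern ran the same way: the
# false-positive rate was roughly twice as high for Black defendants,
# while white reoffenders were more often misclassified as low risk.
```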

IV. Human Biased by Nature

A 2010 survey asked respondents to estimate the likelihood of Black people committing burglaries, illegal drug sales, and juvenile crime; white respondents overestimated Black criminal activity by 20 to 30 percent (Pickett et al. 2012). Due in part to the historical, racial, and socio-economic hierarchical structures permeating the U.S., racial biases tend to be baked into the “default settings” of modern-day society. Though some individuals are explicitly racist, today many prejudices tend to appear through implicit actions and suggestive imagery. Implicit bias works at the subconscious level; so, while a person may consciously reject racial biases, those biases may surface subconsciously through stereotyping and racial associations. As it relates to policing, considerable evidence suggests explicit and implicit bias can significantly impact who ultimately gets stopped, searched, and detained by law enforcement (Eberhardt et al. 2004; Payne 2001; Stack 2018; Pickett et al. 2012). A 2014 USA Today analysis of FBI reports on justifiable officer-involved homicides from 2005 to 2012 found that white police officers killed Black suspects at a rate of twice per week (Heath et al. 2014). A 2015 report on police violence determined that 1,152 people were killed by police that year. In 14 of the 60 U.S. police departments reviewed, Black people accounted for 100 percent of the officer-involved homicides in 2015; this included large police departments operating in cities such as Minneapolis, St. Louis, Boston, Philadelphia, and Washington, D.C. (Mapping Police Violence 2015).

Much of the appeal of data-driven policing stems from the assumption that algorithms can replace human decision-making with an automated, neutral process that eliminates the potential for human bias. Unfortunately, there is ample evidence that policing algorithms often reflect the biases held by their human creators. At the same time, existing research is relatively silent on linkages between algorithmic policing methods and their ability to successfully reduce and prevent criminal activity (Lum et al. 2016). As currently deployed and operationalized, data-driven policing has the potential to reinforce implicit bias and exacerbate racial inequality (Brayne 2017; Fagan et al. 2016; Lum et al. 2016; Saunders et al. 2017). There is a common misconception that computer-based technology allows for objective rationale and thought. However, all technology is both human-developed and human-informed. The likelihood of such algorithmic programs developing more racist tendencies only increases as their processing becomes more human-like (Buranyi 2017).

Algorithmic bias has become so ingrained in society that it barely gets noticed. It was no fluke when, in 2015, a Google Maps search for the “[n-word] house” directed users to 1600 Pennsylvania Avenue NW (the White House), home to then-President Barack Obama. Similar instances have persisted throughout the years. In 2018, Google revised its photo tagging features to include a ban on labeling images as “chimpanzee,” “gorilla,” or “monkey”; this decision was a direct response to an ongoing trend of Black people appearing in the image search results for these terms (Hern 2018). An algorithm is only as unbiased as its data set; while most implicit acts of discrimination on the Internet tend to be quickly addressed, questions remain as to how they were able to occur in the first place.

Though technology may present an opportunity for debiasing, as it presently exists, it cannot operate independently of some level of human influence. There is reasonable concern that police data, when input into the algorithms informing policing technologies, may encourage racial profiling and targeted police surveillance within Black and POC communities. The historical inequalities reflected in crime report data lend themselves to discriminatory predictive outputs, which only serve to feed the cycle of racial disparity in the criminal justice system. While data-driven policing can revolutionize how we police and surveil in America, as it is currently practiced, it furthers discriminatory practices and narratives that promote a societal norm of racial prejudice.

V. Conclusion

Within the last two decades, data-driven policing tools have been adopted by over 7,000 law enforcement entities and police departments across the nation (Electronic Frontier Foundation 2021). Proponents of data-driven policing believe its application allows for the most economical and tactical utilization of law enforcement resources. However, such algorithms have demonstrated a proclivity for reinforcing discriminatory policing practices, amplifying the probability of denoting minority neighborhoods as “crime hotspots” and Black people as likely criminals. While data-driven policing remains in use throughout the nation, continued public criticism of its ethics has recently resulted in some impactful changes. In April 2020, the LAPD announced its discontinuance of PredPol services (other methods of data-driven policing remain in practice). That same summer, Santa Cruz became the first U.S. city to ban the use of data-driven policing altogether. In a statement to the press, Justin Cummings, the Mayor of Santa Cruz, stated, “[There is] much work left to do to get bias out of police data…it doesn’t make any sense to try and use technology when the likelihood that it’s going to negatively impact communities of color is apparent” (Pierce 2020).

As algorithmic technology, data analytics, and machine learning continue to evolve, it will become increasingly critical for policies and processes to be enacted to guard against practices of automated racism. Many questions remain to be investigated and answered: Can algorithms be trained to filter out discriminatory data through the application of debiasing techniques? Or can the data be corrected only through an overall restructuring of policing behaviors and practices? The present and future of policing in America are interwoven with data-driven technology. Should law enforcement continue to operate in ways that amplify the racially biased undertones of the U.S. criminal justice system, the crime report data intended to feed data-driven policing technologies will be forever tainted with the stains of our nation’s racist founding principles.

+ Author biography

Dia Porter is a native Washingtonian whose policy interests are urban-centric. Her current research focuses on affordable housing, big data regulation, and transformative justice. She holds a B.A. in Communications and Business from George Mason University and received honors from Oxford University for her research in Gender Studies and Organizational Communications.

+ Endnotes

[1] These statistics do not account for incarcerated persons identifying as “two or more races,” “some other race alone,” or Afro-Latino (Prison Policy Initiative 2020).

[2] The data-driven policing technologies reviewed included PredPol, as well as the Operation LASER (Los Angeles Strategic Extraction and Restoration) and Suspicious Activity Reporting (SAR) programs.

[3] As described by the Center for Evidence-Based Crime Policy, “broken windows policing” “has been synonymous with zero tolerance policing, in which disorder is aggressively policed and all violators are ticketed or arrested…The most frequent indicator of broken windows policing has been misdemeanor arrests” (2020). A review of New York arrests from 2000 to 2005 found that 86 percent of those arrested self-identified as Black or Latinx (Kamalu and Onyeozili 2018).

[4] “Stop-and-Frisk” refers to the New York City Police Department practice of temporarily detaining, questioning, and at times searching civilians and suspects on the street for weapons and other contraband. Concerns regarding racial targeting led to federal investigation, and in 2013 a federal court ruled the practice unconstitutional (Thompson 2013).

+ References

Angwin, J., Larson, J., Mattu, S., and Kirchner, L. 2016. “Machine bias.” ProPublica. https://www.propublica.org

Baumgartner, F.R., Epp, D.A., and Shoub, K. 2018. Suspect Citizens: What 20 Million Traffic Stops Tell Us about Policing and Race. Cambridge: Cambridge University Press.

Buranyi, Stephen. 2017. “Rise of the racist robots – how AI is learning all our worst impulses.” The Guardian. https://www.theguardian.com

Center for Evidence-Based Crime Policy. 2020. “What works in policing?” https://www.cebcp.org/

Eberhardt, J., Goff, P., Purdie, V., and Davies, P. 2004. “Seeing black: race, crime, and visual processing.” Journal of Personality and Social Psychology 87 no. 6: 876-93.

Electronic Frontier Foundation. 2021. “Atlas of surveillance.” March 23, 2021. https://www.atlasofsurveillance.org

Heath, B., Hoyer, B., and Johnson, K. 2014. “Local police involved in 400 killings per year.” USA Today. August 15, 2014. https://www.usatoday.com

Hern, A. 2018. “Google's solution to accidental algorithmic racism: ban gorillas.” The Guardian. January 12, 2018. https://www.theguardian.com

Horwitz, J. and Seetharaman, D. 2020. “Facebook creates teams to study racial bias, after previously limiting such efforts.” The Wall Street Journal. July 21, 2020. https://www.wsj.com

Kamalu, N., and Onyeozili, E. 2018. “A critical analysis of the ‘broken windows’ policing in New York City and its impact: implications for the criminal justice system and the African American community.” African Journal of Criminology and Justice Studies 11 no. 1: 71-94.

King, R. 2008. Disparity by geography: The war on drugs in America’s cities. The Sentencing Project. https://www.sentencingproject.org

Koper, C.S., Lum, C., and Willis, J. 2014. “Realizing the potential of technology for policing.” Translational Criminology no. 7: 9-10, 17.

Leinfelt, F.H. 2006. “Racial influences on the likelihood of police searches and search hits.” Police Journal 79 no. 3: 238-257.

Lum, C., Koper, C.S., and Willis, J. 2016. “Understanding the limits of technology’s impact on police effectiveness.” Police Quarterly 20 no. 2: 135-163. https://doi.org/10.1177/1098611116667279

Mapping Police Violence. 2015. 2015 police violence report. https://www.mappingpoliceviolence.org

Moravec, E.R. 2019. “Do algorithms have a place in policing?” The Atlantic. September 5, 2019. https://www.theatlantic.com

Office of the Inspector General. 2020. Significant OIG reports. Los Angeles: Office of the Inspector General, Los Angeles Police Commission. https://www.oig.lacity.org/

Payne, K. 2001. “Prejudice and perception: the role of automatic and controlled processes in misperceiving a weapon.” Journal of Personality and Social Psychology 81 no. 2: 181-92.

Pickett, J.T., Chiricos, T., Golden, K.M., and Gertz, M. 2012. “Reconsidering the relationship between perceived neighborhood racial composition and whites’ perceptions of victimization risk: do racial stereotypes matter?” Criminology 50 no. 1: 145-186. https://doi.org/10.1111/j.1745-9125.2011.00255.x

Pierce, J. 2020. “Was it racist? Santa Cruz bans predictive policing.” Good Times Santa Cruz, June 30, 2020. https://www.goodtimes.sc/

PredPol. 2020. “Predictive policing technology.” Accessed March 24, 2021. https://www.predpol.com

President’s Commission on Law Enforcement and Administration of Justice. 1967. The challenge of crime in a free society. Washington, DC: United States Government Printing Office.

Rosich, K. J. 2007. Race, ethnicity, and the criminal justice system. United States: American Sociological Association. https://www.asanet.org

Sakala, L. 2014. Breaking down mass incarceration in the 2010 census: State-by-state incarceration rates by race/ethnicity. United States: Prison Policy Initiative. https://www.prisonpolicy.org/

Sawyer, W., and Wagner, P. 2020. Mass incarceration: The whole pie 2020. United States: Prison Policy Initiative. https://www.prisonpolicy.org/

Stack, T. 2018. “Racial biases within stop and frisk: the product of inherently flawed judicial precedent.” Ramapo Journal of Law & Society. https://www.ramapo.edu

The Sentencing Project. 2018. Report to the United Nations on racial disparities in the U.S. criminal justice system. United States: The Sentencing Project. https://www.sentencingproject.org

Thompson, T. 2013. “NYPD’s Infamous Stop-and-Frisk Policy Found Unconstitutional.” The Leadership Conference. https://www.civilrights.org/

U.S. Census Bureau. 2020. United States Race and Ethnicity. United States: U.S. Census Bureau. https://www.data.census.gov

U.S. Department of Justice. 2019. Arrest rates by offense and race, 2019. United States: U.S. Department of Justice. https://www.ojjdp.gov/