The Future (of AI) is Female: How Hiring Bias Mitigation in NLP Can Be Great for Women Now and in the Future

Tracy Lee | ladyleet · Nov 13 '19 · Dev Community

INTRODUCTION

AI technology is often billed as an answer to the physical and mental shortcomings of the human brain and its productive capacity. We think of its processes as wholly objective, separate from human bias and prejudice, without considering that machines can only learn from the data we provide them. As AI-powered technologies continue to permeate every industry, organization, and social structure, we are seeing the negative influence that our history of deeply encoded gender bias has had on contemporary digital innovation.

These problems can seem so insurmountable that some question whether Artificial Intelligence will actually damage our progress toward a more equitable society. The intent of this article is to show why Artificial Intelligence may suffer from the prejudice ingrained in human language, especially in evaluative software like applicant scoring programs, and why recognizing this issue and taking action could be great for women in tech.

Human language, especially written language, has always reflected the biases of historically platformed groups. In the English language, literary history has assigned feminine connotations to many negative terms like “chattering” and “bitchy”, while similar behaviors in men might be described as “gregarious” and “assertive”, respectively. In turn, our machines are teaching themselves this very same bias through Natural Language Processing. As we work to deprogram socially ingrained prejudices within ourselves, will we be as diligent with our machines?

Of course, this is a pressing issue, but I believe AI can be one of the greatest tools for combating social inequality moving forward. However, the only way to do this is to begin mitigating the bias inherent in NLP by reevaluating how algorithms interact with language that may reflect unconscious or intentional bias in human speech. And the best way to do that is to balance the ratio of men and women contributing to the development of these technologies.

Currently, only 1 in every 4 computing positions is held by a woman. And when we look even closer, we see that, among women working as developers, there is a significant disparity in seniority compared to their male colleagues. Some might point to this as a reflection of disinterest in STEM among women, but the numbers show that this is not true. In both 2017 and 2018, women made up roughly 40% of all coding bootcamp graduates. However, a significant proportion of these graduates are unable to bridge the gap from formal education to a first junior development role, and even when women do launch development careers, their male colleagues are currently 3.5 times more likely to hold senior-level positions by the age of 35.

AI has the capacity to be one of the most integral tools in eliminating human bias. It is our responsibility to ensure that learning algorithms are not teaching themselves the same sorts of problematic thinking that objective, evaluative software is meant to guard against. This, however, is not simply a problem of addressing technical shortcomings, but an opportunity to empower femme developers who will bring not only their technical talent, but their experiences as women, to be the arbiters of how NLP is susceptible to negative gender bias. It is time for project managers and C-level executives to take a step back and evaluate whether their teams are demographically balanced, and whether their team’s structure is built to uplift junior developers, where an overwhelming proportion of women find themselves perpetually stuck.

GENDER BIAS IN LANGUAGE

Human language is perhaps the most critical way that gender bias is perpetuated and reinforced within culture. Antiquated Western stereotypes about the roles of men and women inform the unconscious associations we make between words and gender. Words that reflect communal or collaborative values have become associated with women, while words reflecting industrious traits are often assigned to men.

This wouldn’t necessarily be a problem were it not for the deeply ingrained social imbalances between men and women, reflected in the way that different types of work, and the language we use to describe work, have been gendered and subsequently valued against one another.

This has created semi-conscious value differentiations between words describing behaviors that society associates with femininity and those describing masculinity. It is a textbook example of the Whorfian Hypothesis, which holds that the language we use shapes the way we think, and in turn reflects and reinforces our social values.

LANGUAGE AS DATA

When thinking about how machines learn, I am reminded of an Introduction to Philosophy class I took while I was in school. I don’t know if this is a common thought exercise for college underclassmen, but one of the essay prompts asked us to make an argument for whether or not it is ethical to “kill” a computer.

Of course, I’m sure the professor would have accepted any compelling argument, but she seemed partial to the idea that a computer is not so unlike a human mind. Computers, like humans, receive input, reference the functions and processes that make meaning out of that input, and produce output. Whether those processes are encoded by a scientist or by our lived experiences is perhaps not as important as we might believe. The difference between these encoded processes is narrowed even further by machine learning, such that AI technologies could theoretically learn in a way so similar to how a human does that their being, so to speak, could be indistinguishable from that of a human.

Artificial intelligence is an often misunderstood science. We aren’t creating robots that exist in a vacuum, “born” with some a priori ability to objectively analyze data points. Not unlike a human brain, a machine must also learn by observing the data available to it. So the problem of AI technologies internalizing the same prejudices that permeate society is a completely realistic, and observable, phenomenon.

When we discuss gender bias in AI, we are often referring to a problem that arises within Natural Language Processing (NLP), a subfield of AI that deals with the extraction and analysis of data from unstructured human language. Some might argue that computers can’t “understand” what words mean with quite the same subjectivity and nuance as a human being, but the reality is that what computers do with language is not far off from what we do. They extract values from words through a slew of different identifiers and context clues, including, but not limited to, grammatical, syntactical, and lexical features, as well as the complex contexts and connotations implied by the relationships between words as they appear in human writing or speech. They then use these values to form an analysis of the data in question and, like us, come to conclusions about that input.
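To make this concrete, here is a minimal sketch of how learned word representations can carry gendered associations. The vectors below are invented toy values, not taken from any real model, but they illustrate the mechanism: when a model learns representations from text that pairs certain traits with certain genders, those associations become measurable geometry.

```python
import math

# Hypothetical 3-dimensional vectors standing in for embeddings a model might
# learn from biased text; real embeddings have hundreds of dimensions.
vectors = {
    "he":         [0.9, 0.1, 0.2],
    "she":        [0.1, 0.9, 0.2],
    "assertive":  [0.8, 0.2, 0.5],  # trait described more often alongside men
    "chattering": [0.2, 0.8, 0.5],  # trait described more often alongside women
}

def cosine(a, b):
    """Cosine similarity: closer to 1.0 means the words point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms

for word in ("assertive", "chattering"):
    print(word,
          "-> he:", round(cosine(vectors[word], vectors["he"]), 2),
          "she:", round(cosine(vectors[word], vectors["she"]), 2))
```

Run on these toy values, “assertive” lands closer to “he” and “chattering” closer to “she”; a downstream system that scores text using such representations inherits that skew without ever being told to.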

HOW LANGUAGE DATA HURTS WOMEN

Beginning in 2014, Amazon.com had been building artificial intelligence-powered recruitment software to help it quickly find the best talent. The system worked by reviewing resumes for specific keywords informed by over ten years of the company’s hiring data, and ranking those resumes based on their similarity to past hires.

It did not take long for Amazon to realize that this algorithm penalized resumes submitted by women. In fact, those who worked on the project reported that resumes from applicants who attended women’s colleges, or whose resumes even contained the word “women”, were given less preference by the software. This, of course, is because the overwhelming majority of Amazon’s technical workforce is male; 2017 stats show that women made up only 40% of its total workforce. Amazon attempted to mitigate the problem by neutralizing terms that denote demographic information, but excluding select words cannot address the issue of gender encoding within all forms of language. Recognizing that it could not account for every way the technology might discriminate against certain groups by assigning different levels of value to words with prejudicial cultural imprints, the company eventually discarded the software in early 2017.
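To illustrate the failure mode (not Amazon’s actual system), here is a hypothetical sketch of a scorer that learns keyword weights from historical hiring outcomes. The resumes, labels, and weights are all invented; the point is that when past hires skew male, tokens correlated with women’s applications pick up negative weight, and removing the explicit term leaves correlated proxies behind.

```python
from collections import Counter

# Invented history: mostly male past hires, so tokens that correlate with
# female applicants (such as "women's") show up mainly among rejections.
history = [
    ("led robotics team java aws", 1),           # 1 = hired
    ("java aws distributed systems", 1),
    ("captain women's chess club java", 0),      # 0 = not hired
    ("women's college cs degree python", 0),
    ("python aws internship", 1),
]

hired, rejected = Counter(), Counter()
for resume, label in history:
    (hired if label else rejected).update(resume.split())

def weight(token):
    # Naive learned weight: net count of the token among hired vs. rejected resumes.
    return hired[token] - rejected[token]

def score(resume):
    return sum(weight(tok) for tok in resume.split())

print(score("java aws python"))                  # scores 4 on this toy data
print(score("women's college java aws python"))  # scores 1: penalized

# "Neutralizing" the explicit demographic term does not fix the pattern:
# proxies such as "college" (from women's colleges) still carry negative weight.
blocklist = {"women's"}

def score_neutralized(resume):
    return sum(weight(tok) for tok in resume.split() if tok not in blocklist)

print(score_neutralized("women's college java aws python"))  # still only 3
```

Even after the explicit term is stripped, the second resume still scores lower, because the learned weights encode the historical pattern through neighboring words. That is exactly the gap a fixed blocklist cannot close.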

THE FUTURE (OF AI) IS FEMALE

There is no easy way to address the problem of gender bias in NLP-based Artificial Intelligence. If there were, I think it’s pretty safe to say that some of the world’s largest technology companies, if only for the sake of public perception (though I would like to have more faith in humanity), would have already implemented these fixes. The problem is much deeper. We need to look at our entire labor culture and ask ourselves: why is it that over half of university graduates are women, and yet only 5% of S&P 500 CEOs are women? Women are pursuing STEM education and moving into increasingly diverse professional areas. Yet in the tech industry, the very industry that could create AI software to mitigate hiring bias, women are not being placed in roles at a rate that reflects the pipeline created by bootcamp education, and when they are, they are not rising in those roles like their male colleagues.

Achieving a solution will not come overnight, but it is not an impossible feat either. We know that NLP technology needs to be able to draw data from unstructured language without giving weight to biases that, over the course of human history, have been deeply encoded into the very language from which that data is drawn.
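One partial mitigation that researchers have explored, sometimes called “hard debiasing”, is to estimate a gender direction in embedding space and project it out of words that should be gender-neutral. The sketch below uses invented toy vectors purely to illustrate the idea; it is not a complete fix, and bias re-enters through proxies and co-occurrence patterns the projection never touches.

```python
import numpy as np

# Invented toy embeddings; a real model's vectors would have hundreds of dimensions.
emb = {
    "he":       np.array([0.9, 0.1, 0.2]),
    "she":      np.array([0.1, 0.9, 0.2]),
    "engineer": np.array([0.7, 0.3, 0.6]),  # toy vector skewed toward "he"
}

# 1. Estimate a gender direction from a definitional pair.
gender = emb["he"] - emb["she"]
gender = gender / np.linalg.norm(gender)

# 2. Remove a neutral word's component along that direction.
def neutralize(vec, direction):
    return vec - np.dot(vec, direction) * direction

before = np.dot(emb["engineer"], gender)
after = np.dot(neutralize(emb["engineer"], gender), gender)
print(f"'engineer' component along the gender direction: {before:.2f} -> {after:.2f}")
```

A geometric fix like this can reduce one measurable symptom, but it does not address how biased language shapes training data in the first place, which is why the human side of the problem discussed below matters so much.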

The technical solution to this has yet to be discovered. However, it is crucial that, in the pursuit of developing these technologies, women play a commanding role. In many ways, driving the direction of our digital tools is as much a social science as it is engineering. We need to internalize the value that lived experience brings to conversations about mitigating bias through technology. Approaching the redesign of applicant scoring and recruitment software without creating more inclusive workplace environments for women in tech is putting the cart before the horse. Too many C-level leaders focus on hitting diversity metrics rather than fostering inclusive environments. If we start by supporting the success of women in tech, who are historically disadvantaged in this industry, we can begin to create the environments where technological solutions can be born of a deeper appreciation for the ways that our past traditions inform the language we use, the ideas we propagate, and the workplaces we build.

SO WHERE DO WE START?

These bias mitigation technologies will be amazing tools for our children, grandchildren, and all of the talented women entering the workforce over the coming decades and beyond. But as of now, NLP software is not developed enough to meaningfully prevent the same insidious biases to which non-augmented hiring processes tend to fall victim.

The truth is that these technologies are not, on their own, the answer to creating a more equitable tech space. They can be a great tool, but we need to work harder to help women overcome the obstacles that prevent them from accessing necessary educational and work opportunities, and to make this industry a space women want to contribute to, want to stay in, and where they feel that not only their talent but their unique experiences are valued.

It will require companies to invest more of their time, energy, and resources in making their businesses places where women receive positive, constructive mentorship and meaningful routes for advancement. Despite being a web development consultancy, This Dot Labs is doing all that it can to give back to the women who make this industry so great by creating avenues through which companies can invest in uplifting developers from historically underrepresented demographics. In the summer of 2019, we launched our Open Source Apprenticeship Program, partnering with several wonderful companies who recognize the value of connecting talented women with paid opportunities to contribute to their open-source projects.

This solution, however, will require our entire industry to internalize the belief that when our industry is more equitable, our technologies become more equitable. We owe it to ourselves, to the future of this industry, and to the millions of people who use and will use NLP and other AI technologies to enrich their lives.

This Dot Inc. is a consulting company with two branches: the media stream and the labs stream. This Dot Media is responsible for keeping developers up to date with advancements in the web platform. To inform developers of new releases and changes to frameworks and libraries, it hosts events and publishes videos, articles, and podcasts. Meanwhile, This Dot Labs provides teams with web platform expertise through methods such as mentoring and training.
