From the course: Ethics in the Age of Generative AI

Organizing data with ethics in mind

- I remember standing at the top of the Burj Khalifa, the tallest building in the world. As I looked out on that view, a part of me was wondering just how strong the foundations must be to protect the people inside. These days, I look at generative AI models that can do amazing things and I find that same part of me evaluating the safety, trust, and design that ensure that these models will inspire and protect all of us. AI models are built on top of data. So let's talk about the importance of ethically organizing and using your institution's data. By taking an ethical approach, you'll reduce the risk to your organization and you'll increase the value of that data as an organizational asset. There are three goals in effective and ethical data organization: the first is prioritizing privacy, the second is reducing bias, and the third is promoting transparency.

The first consideration is prioritizing privacy. Almost every organization collects sensitive data about customers and employees, things like personal healthcare information or financial and banking details. Customers and employees trust the organization with this data, so it's important to handle it sensitively and ethically. Failing to uphold this trust can expose a company to liability and reputational harm and, maybe most importantly, erode trust with your customers. So to test your company's practices, you can lead a privacy audit. During a privacy audit, you build a comprehensive understanding of what data your organization has, how it was collected, how it's stored, and how it's administered. The results of a privacy audit inform recommendations to create or adapt your existing privacy policy to protect sensitive data. With a privacy policy in place, the next step is to create a training curriculum for all employees that focuses on understanding why sensitive data must be handled securely and advises them of their responsibilities.

The second goal is reducing bias in data collection and in data use.
Bias in data can arise from a number of sources, and understanding how it makes its way into your dataset requires genuine curiosity in your analysis. To start a bias audit, be curious about whether the data really represents the population you're trying to serve. For example, I recently worked with an organization building AI for cancer screening. As they tried to deploy this tool, they found that early models exclusively used training data from the global north, requiring a retraining of the model to make it useful for a global population. So does your dataset represent inputs from a diversity of individuals across race, gender, age, and more? Are we asking the right questions when we collect data? You might also consider whether your data collection process was accessible to differently-abled people. And finally, once the data is collected, you might consider whether a team with relevant and diverse lived experience has an opportunity to analyze and interpret this data to reduce the risk of potential bias. Bias is especially important when we attempt to explain how our algorithms make recommendations that have real impacts on people's lives. For example, recent studies have shown that early attempts to automate hiring have propagated existing biases in employment practices. Understanding the bias in the data helps us minimize the negative impacts of bias in the algorithm.

After you've completed your privacy and your bias audits, transparency is the final step in the process. You want to be able to explain to all stakeholders (your customers, your employees, your suppliers, your regulators) how data is collected and used. You might consider publishing a data governance framework or a data transparency statement to help your stakeholders understand what you do with their data. And you should also make it clear that individuals can access any data that you might have stored about them and have rights over how you might use it on an ongoing basis.
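One way to make the first bias-audit question concrete (does the data represent the population you're trying to serve?) is to compare each group's share of the dataset with its share of that population. The sketch below is only an illustration under stated assumptions: the function name `representation_gaps`, the `region` attribute, the 5-point tolerance, and the 50/50 reference shares are all hypothetical and are not taken from the course or from real population statistics.

```python
from collections import Counter


def representation_gaps(records, attribute, reference_shares, tolerance=0.05):
    """Return groups whose share of the dataset differs from the reference
    share by more than `tolerance` (positive = over-represented)."""
    counts = Counter(record[attribute] for record in records)
    total = sum(counts.values())
    gaps = {}
    for group, expected in reference_shares.items():
        observed = counts.get(group, 0) / total if total else 0.0
        if abs(observed - expected) > tolerance:
            gaps[group] = round(observed - expected, 3)
    return gaps


# Hypothetical toy dataset: 9 of 10 records come from one region.
records = [{"region": "global_north"}] * 9 + [{"region": "global_south"}] * 1

# Hypothetical reference shares for the population the tool should serve.
reference_shares = {"global_north": 0.5, "global_south": 0.5}

gaps = representation_gaps(records, "region", reference_shares)
print(gaps)
```

A check like this only flags gaps on attributes you already record. It does not replace the other questions in the audit, such as whether the collection process was accessible or who interprets the results.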
Organizing and understanding your data helps you understand your customers better, ensure they're well represented, weed out biases, and build a stronger foundation for your AI products and tools.
