Executive Summary
Generative AI systems like ChatGPT or Bard may present risks to your organization, especially if it is a regulated enterprise. While these technologies hold great promise, it is essential to understand the risks, where they come from, and reasonable mitigation measures.
We’ve identified three primary ‘sources’ of generative AI risk: the training data used for the model, the composition and outputs of the model itself, and the context and user experience in which a model is deployed. Within each of these three risk sources, we outline some of the most prominent risk types, along with potential guardrails or mitigation efforts an organization may take to build trust in its AI system.
Risk Sources
Data Risks
Risks related to the data sources, structure, and timeliness.
Risk Sources | Guardrails & Mitigation Efforts |
Copyright & IP Violations Generative AI models are trained on massive amounts of data, mostly collected through scraping the internet. It is very likely that some of the data used to create these models falls under copyright or other intellectual property rights. The current legal status of the ‘fair use’ doctrine for this internet data has not yet been settled and may differ across jurisdictions. Using generative AI to create content for certain kinds of purposes could open up your organization to legal liability for copyright infringement. |
|
Training Data Timeliness Generative models take a long time to train because they require massive amounts of data and can have billions of parameters. Their training data also needs to be properly cleaned and structured before it can be used. Because of this, the training datasets for some models may not be ‘current’: they could be days, weeks, or years out of date. Some models try to incorporate recent data on a regular basis using newer retuning techniques. The time delay could have a significant impact on certain kinds of tasks, for example analyzing recent news content. |
|
Domain Knowledge Many generative AI systems were trained on easily accessible data from the internet. However, some domains have long protected their content behind paywalls or with anti-scraping techniques. As a result, many off-the-shelf generative models, such as GPT-4, may not have deep domain knowledge in areas such as finance or healthcare. Their recommendations in those areas are therefore likely not based on the highest quality information available in those fields, which can pose a risk for certain use cases. Some models can be ‘fine tuned’ with such specialized data to adapt them for those domains. |
|
Societal & Historical Biases The training datasets used for generative AI reflect existing societal and historical biases. For example, historically (and currently), most corporate CEOs are men, so many generative models will, most of the time, produce an image or description of a man when asked about a CEO. Use of these models could perpetuate stereotypes or biases and make assumptions about the world that may not support your organization’s brand or values. |
|
Language Support Many Large Language Models are trained on text across multiple languages. This gives the models some ability to understand non-English languages for both inputs and outputs. However, their performance and stability across these languages are still being researched, and languages that are underrepresented in the training dataset (which is largely the public internet) will see notably lower performance. Before relying on these models for non-English data, have native speakers carefully evaluate their outputs. |
|
Model Risks
Risks related to the model’s parameters or prompt inputs.
Risk Sources | Guardrails & Mitigation Efforts |
Hallucination Generative AI systems function primarily by leveraging statistical associations between various features. In Large Language Models (LLMs) such as GPT-4, those are the associations between words, phrases, and sentences. In image generation models such as DALL-E, those are the associations between pixels of different colors. These models do not necessarily have a full understanding of the ‘real world’, so they can very convincingly produce content that is not true. For example, if you ask an LLM to tell you about a fictional person, it will say something, but there’s a very high probability that the content it produces is not ‘true’. This hallucination problem can be more significant depending on the type of model, some of the model parameters, and certain kinds of tasks/prompts. |
|
Input Data Privacy Some generative AI systems, especially those that run as a cloud service or API, will automatically log all inputs into the system. These inputs may then be used as additional training data for future versions of the generative model. This information is collected to detect potential misuse, as well as to improve model quality and safety. However, sensitive or proprietary data could be logged in the process. Some open source models and cloud services do not capture that information because they can run inside an organization’s network. These may be preferable, although they are often more expensive. |
|
Prompt Hacking/Injection/Jailbreaking Many generative AI systems have added protections/safeguards to prevent malicious, manipulative, or harmful uses. These protections are very new, and there are already several documented ways of ‘breaking’ them or otherwise compromising the model. This is often called ‘Jailbreaking’. Similarly, while your team may define a ‘prompt’ (or input) for a model, there are ways an adversarial actor can inject their own prompt and then make it appear as though your system generated something inappropriate. This is called ‘Prompt Injection’. |
|
Harmful Content Generative AI systems, despite all the protections now being added, can generate harmful content. Such content may contain hate speech or perverse images. While many model creators are trying to reduce this, many open source systems may not have these protections, or their protections may be less advanced. The content created by generative AI could cause emotional or psychological harm, especially if it is used by or targeted at specific populations (e.g., children). |
|
Output Stability & Reliability Generative models can generate a different output each time they are run, even if the same ‘prompt’/input is used. This is a double-edged sword. The unique nature of the output can be beneficial in certain situations, such as writing emails. In other use cases, such as giving medical advice, inconsistent output could be a major issue. The ‘stability’ of the model’s output, as measured by how consistent the outputs are for the same input, should be evaluated. This can be done by running the same prompt several times and evaluating whether the outputs vary drastically or even contradict each other. |
|
System Safeguard Interactions The protections added to generative systems to try to prevent them from generating harmful content may interfere with legitimate prompts or use cases. For example, some systems will refuse to run any prompt that contains the word ‘abortion’, even if the use case is simply to summarize a news article about it. Knowing what kinds of protections exist, and how the system responds when they are triggered, is essential for understanding how the system may behave once deployed to users. |
|
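The stability check described under Output Stability & Reliability (run the same prompt several times and compare the outputs) can be sketched roughly as follows. This is a minimal illustration, not a definitive evaluation method: `stability_score` and the `generate` callable are hypothetical names standing in for whatever model call your system uses, and character-level similarity from Python's standard `difflib` module is an assumed, crude proxy for consistency.

```python
from difflib import SequenceMatcher
from itertools import combinations
from typing import Callable


def stability_score(generate: Callable[[str], str], prompt: str, runs: int = 5) -> float:
    """Run the same prompt `runs` times and return the mean pairwise
    text similarity of the outputs (1.0 means all outputs were identical).

    `generate` is a hypothetical stand-in for your model call: it takes a
    prompt string and returns the generated text.
    """
    if runs < 2:
        raise ValueError("runs must be at least 2 to compare outputs")

    # Collect one output per run for the identical prompt.
    outputs = [generate(prompt) for _ in range(runs)]

    # Compare every pair of outputs; ratio() is in the range [0.0, 1.0].
    ratios = [
        SequenceMatcher(None, first, second).ratio()
        for first, second in combinations(outputs, 2)
    ]
    return sum(ratios) / len(ratios)
```

A low score flags prompts whose outputs vary widely and deserve closer review. Text similarity alone will not reveal outputs that are worded similarly but contradict each other in meaning, so a human reviewer would still need to read the flagged outputs.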
User Experience & Context Risks
Risks related to lack of user knowledge or environmental concerns.
Risk Sources | Guardrails & Mitigation Efforts |
Disclosure Many users may be unaware they are interacting with a generative AI system or that some content (text, images, videos, etc.) was generated by AI. This can cause confusion or a lack of trust. Many current-generation systems can create realistic content that may appear to be human-generated. |
|
Organizational Copyright & IP Protection The US Patent Office has declared that AI-generated content is not eligible for standard trade protections including patents, trademarks, and copyright. That is largely due to language in existing statutes stating that such protections can only be granted to natural persons. This can create a serious dilemma for organizations using AI to generate content that they may want to protect from an IP perspective. |
|
Logging Requirements Current and future laws that apply to AI systems may require organizations to provide restitution for any errors or harms that stem from AI use. In order to provide such restitution, it is essential to keep enough information about a system’s users and their inputs to be able to identify potentially harmed parties. This, however, may come with its own privacy challenges that need to be addressed. |
|
Compute Time, Energy & Costs Generative AI models require a lot of computing power and energy to train and to generate their content. Frequent use of these models may have environmental or cost impacts. Using these models frivolously could raise ethical concerns given the potential environmental impacts. |
|
Chained Prompts The latest generative AI systems allow users to ‘chain’ several prompts together so that the output of one prompt is passed into another. This has proven to be very powerful for complex use cases. The technology, however, is very new and almost no safety research has been done on it as of the time of writing. Using such cutting-edge technology may introduce additional risks. |
|
Overreliance Generative AI systems can generate a lot of content that is ‘good enough’ for some purposes but not for others. Human users who do not properly understand the limitations of a generative AI system could come to rely on it too heavily and no longer think critically about its outputs. This can create issues for some tasks such as code generation, where recent studies have shown that even state-of-the-art code generation models can produce code with serious security vulnerabilities. While human oversight is a common and recommended risk mitigation mechanism, it only works if the oversight is done with a ‘critical’ eye and does not become a rubber-stamp effort. |
|
User Training & Education Generative AI systems are very new and are now also very accessible. Major platforms such as Microsoft Office and Google Search now include features built on generative AI systems. While many users have experimented with these systems, the best practices and risks are not well researched and hence not widely known by the majority of their users. As with any software system, it may be appropriate to require some degree of training or education for anyone looking to use these models in a professional capacity, to ensure that best practices are being followed. |
|
Vulnerable Populations Some populations within our society, such as children and the elderly, may be particularly vulnerable to manipulation or confusion when interacting with a generative AI system. There has been no extensive research into the interaction between these populations and state-of-the-art generative models, especially when used in the context of an AI chatbot. These groups may be especially likely not to realize they’re interacting with an AI system, which can bring about adverse effects. |
|
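To make the Chained Prompts entry above concrete, the sketch below shows one way the output of one prompt can be passed into the next. It is an illustration under stated assumptions rather than any particular product's API: `run_chain`, the `model` callable, and the `{input}` placeholder convention are all hypothetical names chosen for this example.

```python
from typing import Callable, List


def run_chain(model: Callable[[str], str], templates: List[str], initial_input: str) -> List[str]:
    """Run a sequence of prompt templates, feeding each output into the next.

    `model` is a hypothetical stand-in for a generative model call.
    Each template must contain an `{input}` placeholder, which is filled
    with the previous step's output (or `initial_input` for the first step).
    Returns the output of every step, in order.
    """
    outputs: List[str] = []
    current = initial_input
    for template in templates:
        # Build this step's prompt from the previous step's output.
        prompt = template.format(input=current)
        current = model(prompt)
        outputs.append(current)
    return outputs
```

Returning every intermediate output, rather than only the final one, is a deliberate choice in this sketch: because each step consumes the previous step's output, keeping the intermediate results makes it possible to review where in the chain a problem first appeared.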
Generative AI holds tremendous promise for organizations, offering opportunities for innovation, creativity, and efficiency. However, it also comes with a set of significant challenges and risks that must be carefully addressed. If you'd like to learn more about how your organization can balance innovation and risk management, contact us below.