top of page

Generative AI Risks & Considerations – Whitepaper



To download the full whitepaper, click below.

Trustible White Paper – Generative AI Risks & Considerations
.pdf
Download PDF • 593KB


Executive Summary


Generative AI systems like ChatGPT or Bard may present risks to your organization, especially for regulated enterprises. While these technologies present great promise, it is essential to understand the risks, where they come from, and reasonable mitigation measures.


We’ve identified three primary ‘sources’ of generative AI risks, including those stemming from the training data used for the model, the composition and outputs of the model itself, and then the context and user experience in which a model is deployed. Within each of these three risk sources, we outline some of the most prominent risks types, and some potential guardrails or mitigation efforts an organization may take to build trust in their AI system


Risk Sources

Data Risks

Risks related to the data sources, structure, and timeliness.

Risk Sources

Guardrails & Mitigation Efforts

Copyright & IP Violations

Generative AI models are trained on massive amounts of data, mostly collected through scraping the internet. It is very likely that some of the data used to create these models falls under copyright or other intellectual property rights. The current legal status of the ‘fair use’ doctrine for this internet data has not yet been settled and may differ across jurisdictions. Using generative AI to create content for certain kinds of purposes could open up your organization to legal liability for copyright infringement.

  • Do not use a prompt that generates content ‘In the style of <person>’

  • Check whether the content generated by generative AI will be directly linked to a revenue source where ‘damages’ claim could be triggered

  • Examine whether the likely alternative to using generative AI would have been to hire an independent content creator (writer, artist, etc), or if the likely alternative would have been to simply not produce the content

Training Data Timeliness

Generative models take a large amount of time to train as they require massive amounts of data and can have billions of parameters. Their training data also needs to be properly cleaned and structured before it can be trained on. Because of this, the training datasets for some models may not be ‘current’. They could be years, weeks, or days out of date. Some models try to add recent data on a regular basis through some newer techniques to retune the models. The time delay could have a significant impact for certain kinds of tasks, for example when analyzing recent news content.


  • Identify the time delay of the model’s training data and whether it’s likely to have an impact towards your intended use case.

Domain Knowledge

Many generative AI systems were trained off of easily accessible data from the internet. However specific domains have always protected their content behind paywalls, or through anti-scraping techniques. Many off the shelf generative models, such as GPT-4, may not have deep domain knowledge on areas such as finance or healthcare. Their recommendations in those areas are therefore likely not based on the highest quality information available in those fields, which can pose a risk for certain use cases. Some models can be ‘fine tuned’ with such specialized data to adapt them for those domains.


  • Don’t assume the models have ‘specialized’ information about niche domains.

  • Research & identify alternative generative AI products that may have domain expertise, and follow all of the relevant safety & risk management guidelines.

Societal & Historical Biases

The training datasets used for generative AI reflect the existing societal and historical biases from our society. For example, historically (and currently), most corporate CEOs are men, and so many generative models will create an image/text of a man when asked about a CEO most of the time. Use of these models could perpetuate stereotypes or biases and make assumptions about the world that may not support your organization’s brand or values.

  • Consider if your use case might make assumptions about people based on historical representations or inappropriate stereotypes about people.

Language Support

Many Large Language Models are trained on text across multiple languages. This allows the models to have some ability to understand non-english languages for both inputs and outputs. However, their performance and stability across these languages is still being researched, and languages that are underrepresented in the training dataset (which is largely the public internet) will have notably less performance. Relying on these models for non-english data should be studied carefully by native speakers.


  • Identify native speakers and conduct validation work for any use case that involves non-english text



Model Risks

Risks related to the model’s parameters or prompt inputs.

Risk Sources

Guardrails & Mitigation Efforts

Hallucination

Generative AI systems function primarily by leveraging statistical associations between various features. In Large Language Models (LLM) such as GPT-4, those are the associations between words, phrases, and sentences. In image generation models such as DALL-E, that’s the association between pixels of different colors. These models do not necessarily have a full understanding of the ‘real world’ and so they are able to very convincingly make stuff up that is not true. For example, if you ask an LLM to tell you about a fictional person, it will say something but there’s a very high probability that the content it produced is not ‘true’. This hallucination problem can be more significant depending on the type of model, some of the model parameters, and for certain kinds of tasks/prompts.


  • Determine whether hallucination would have a major impact for the given use case (e.g. summarizing news articles -> low impact, summarizing medical records -> high impact)

  • Determine how likely hallucination is for a given prompt. This will require some human testing.

  • Disclose this risk to the system user.

Input Data Privacy

Some generative AI systems, especially those that run as a cloud service or API, will automatically log all inputs into the system. These inputs may then be used as additional training data for future versions of the generative model. This information is collected to detect potential misuses, as well as to improve model quality and safety, however sensitive or proprietary data could be logged. There are some open source models and cloud services that do not capture that information as they can run inside of an organization's’ network that may be preferable, although often more expensive.

  • Assume all prompts/inputs will be logged and cannot be deleted. Unless the system is deployed with appropriate protections (e.g. deployed inside company systems)

  • Never put personally identifiable information (PII) into an unapproved cloud AI system (e.g. ChatGPT)

  • Never put sensitive company information into an unapproved AI system (e.g. ChatGPT)

  • Examine the terms and conditions for any generative model and the related interface

Prompt Hacking/Injection/Jailbreaking

Many generative AI systems have added protections/safeguards to prevent malicious, manipulative, or harmful uses. These protections are extremely novel and there are already several documented ways of ‘breaking’ these protections, or otherwise compromising the model. This is often called ‘Jailbreaking’. Similarly, while your team may define a ‘prompt’ (or input) for a model, there are ways an adversarial actor can inject their own prompt and then make it appear like your system generated something inappropriate. This is called ‘Prompt Injection.

  • Prevent user inputted data from directly being input into a prompt

  • Implement jailbreaking/injection detection systems/checks

  • Assess the ‘worst case’ scenario for if a prompt is overtaken and the possible consequences

Harmful Content

Generative AI systems, despite all the protections now being added, can generate harmful content. Such content may contain hate speech or perverse images. While many model creators are trying to reduce this, many open source systems may not have these protections added or will not be as advanced. The content created by generative AI could cause emotional or psychological harm especially if used or targeted towards specific populations (e.g. children)

  • Generative AI inputs/prompts can include sections to tell the model not to generate harmful content

  • Identify the target users and use context for the system and whether harmful content is likely to cause emotional or psychological harm

Output Stability & Reliability

Generative models will generate a different output each time they are run, even if the same ‘prompt’/input is used. This is a double edged sword. The unique nature of the output can be beneficial in certain situations, such as writing emails. In some use cases, the inconsistent output could be a major issue, such as when giving medical advice. The ‘stability’ of the model's output, as measured by how consistent the outputs are for the same input, should be evaluated. This can be done by running the same prompt several times and evaluating if the outputs vary drastically or even contract each other at different times.

  • Conduct stability testing for prompts and have humans evaluate how consistent the outputs are

  • Identify if there are any potential legal issues with unstable outputs

System Safeguard Interactions

The protections added to generative systems to try to prevent them from generating harmful content may interfere with legitimate prompts or use cases. For example, some systems will refuse to run any prompt that contains the word ‘abortion’, even if the use case is to simply summarize a news article about it. Knowing what kinds of protections exist, and how the system responds to them is essential for understanding how the system may behave once deployed to users.

  • Identify the likelihood of users asking about ‘sensitive’ topics/content



User Experience & Context Risks

Risks related to lack of user knowledge or environmental concerns.

Risk Sources

Guardrails & Mitigation Efforts

Disclosure

Many users may be unaware they are interacting with a generative AI system or that some content (text, image, videos, etc) was generated by AI. This can cause confusion or lack of trust. Many current generation systems can create realistic content that may appear to be human generated content.

  • Always clearly disclose to a user when they’re interacting with a generative AI systems (e.g. AI Chatbot)

  • Always disclose any content that was generated with AI as well as which AI system was used or which prompt was used to produce the output

Organizational Copyright & IP Protection

The US Patent office has declared that AI generated content is not eligible for standard trade protections including patents, trademarks, and copyright. That is largely due to language in existing statute that states such protections can only be granted to natural persons. This can create a large dilemma for organizations using AI to generate content that they may want to protect from an IP perspective.

  • Core activities that require IP protections should not heavily leverage generative AI.

Logging Requirements

Current and future laws that apply to AI systems require organizations to provide restitution for any errors or harms that stem from AI use. In order to provide such restitution, it is essential to keep enough information about a system’s users and their system inputs to be able to identify potentially harmed parties. This, however, may come with its own privacy challenges that need to be addressed.

  • Log all inputs to a generative AI system including any user id/email, the model being used, and the model outputs

  • Store all logs in a data lake for at least 3 years

Compute Time Energy & Costs

Generative AI models require a lot of computational energy to train and generate their content. Frequent use of these models may have environmental or cost impacts. Using these models frivolously could face ethical issues given the potential environmental impacts.

  • Conduct a cost-benefit analysis for generative AI use that considers if generative AI is necessary compared to other lower cost options

Chained Prompts

The latest generative AI systems allow users to ‘chain’ several prompts together in order to take advantage of passing the outputs of one prompt into another. This has proven to be very powerful for complex use cases. The technology, however, is very new and almost no safety research has been done on it as of the time of writing. Using such cutting edge technology may open additional risks.

  • Identify the sequential steps in the prompt chain and evaluate the necessity of them

Overreliance

Generative AI systems can generate a lot of content that is ‘good enough’ for some purposes but not for others. Human users who do not properly understand the limitations of a generative AI system could come to overly rely on the system and no longer think critically about its outputs. This can create issues for some tasks such as code generation, where recent studies have shown that even state of the art code generation models can produce code with serious security vulnerabilities. While human oversight is a common and recommended risk mitigation mechanism, that only works if the oversight is done with a ‘critical’ eye and does not become a rubber stamp effort.

  • Set up a training program for generative AI end users to ensure they understand the limitations and can detect flawed outputs

  • Consider methods and policies to ensure human oversight is maintained

User Training & Education

Generative AI systems are very new and are now also very accessible. Major platforms such as Microsoft Office and Google Search now include features built off of generative AI systems. While many users have experimented with these systems, the set of best practices and risks are not well researched and hence not widely known by the majority of their users. Much like any software system, it may be appropriate to require some degree of training or education for anyone looking to use these models in a professional capacity to ensure that any best practices are being implemented.

  • Develop a basic training/education resource for appropriate and safe use of generative AI systems

Vulnerable Populations

Some populations within our society, such as children and the elderly, may be particularly vulnerable to manipulation or confusion in interacting with a generative AI system. There has been no extensive research into the interaction of these populations and the state of the art generative models, especially when used in the context of an AI chatbot. These groups may be especially vulnerable to not knowing they’re interacting with an AI system – which can bring about possible adverse effects.

  • Do not develop generative AI systems that deliberately target children or senior citizens


Generative AI holds tremendous promise for the organizations – offering opportunities for innovation, creativity, and efficiency. However, it also comes with a set of significant challenges and risks that must be carefully addressed. If you'd like to learn more about how your organization can balance innovation and risk manage, contact us below.

Comments


bottom of page