Prompt Engineering with Octopus


The Octopus AI interface uses a Large Language Model (LLM) to answer plain English questions about an Octopus space. LLMs are rapidly evolving and improving in their ability to understand the relationships between complex objects, like those that make up an Octopus space.

However, today, the ability to extract useful information when querying an LLM is heavily influenced by the structure of the question. Building questions that generate the desired result is called Prompt Engineering.

Limitations of Octopus AI

Octopus AI is a read-only query interface. You cannot initiate deployments, delete variables, create projects, or modify the space in any way.

It is most useful for understanding the state of your space with plain English queries that would otherwise require a complex script to query the Octopus API. However, you must verify the answers provided by Octopus AI - see the section "Verifying the answers" for more details.

Including resources in the context

The most important factor when querying Octopus AI is to include the correct keywords to identify the kinds of resources the question relates to. LLMs have a context limit, which means they can only be supplied a small subset of the configuration that makes up an Octopus space. The subset of resources included in the context is defined by a number of keywords you must include in a query.

The table below lists the keywords recognized by Octopus AI, the resources the keywords relate to, and an example query demonstrating how the keyword is used.

| Resource | Keyword | Good Example | Bad Example | Notes |
| --- | --- | --- | --- | --- |
| Deployment Projects | project | What does the "Deploy Web App" project do? | What does "Deploy Web App" do? | |
| Deployment targets or agents | target, machine, or agent | What URL is the "K8s" machine configured with? | What URL does "K8s" have? | Prefer the word machine over target, as target can sometimes trigger the LLM content filters. |
| Runbooks | runbook | Does the runbook "restart web server" in the project "web app" include a bash script? | Does "restart web server" include a bash script? | Include the project that the runbook belongs to. |
| Tenants | tenant | What is the description of the "US-EAST-1" tenant? | What is the description of "US-EAST-1"? | |
| Library Variable Sets | library variable set | What variables are included in the library variable set "AWS"? | What variables are included in "AWS"? | |
| Environments | environment | Does the "Production" environment allow dynamic infrastructure? | Does "Production" allow dynamic infrastructure? | |
| Feeds | feed | What URL does the "Docker" feed use? | What URL does "Docker" use? | |
| Accounts | account | What is the access key for the "AWS Production" account? | What is the access key for "AWS Production"? | |
| Certificates | certificate | What is the description of the "Test Web Server" certificate? | What is the description of "Test Web Server"? | |
| Lifecycles | lifecycle | What environments are part of the "Application" lifecycle? | What environments are part of "Application"? | |
| Worker Pools | worker pool | What is the description of the "Azure" worker pool? | What is the description of "Azure"? | |
| Machine Policies | policy | What does the health check script for the "K8s Worker" machine policy do? | What does the health check script for "K8s Worker" do? | |
| Tenant Tag Sets | tag | What tags are assigned to the tenant "Store 1"? | What groups does the tenant "Store 1" belong to? | |
| Project Groups | group | What projects belong to the "Default" project group? | What belongs to "Default"? | |
| Deployment Process or Runbook steps | step | What does the "Deploy to Azure" step in the "Web App" project do? | What does "Deploy to Azure" do? | |
| Variables | variable | Which steps in the "Web App" project is the "DatabasePassword" variable used in? | Where is "DatabasePassword" used? | |
| Releases | release | What is the latest release for the "Web App" project? | What is the latest version of the "Web App" project? | |
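
As a rough illustration of this rule, the Python sketch below (hypothetical, not part of Octopus AI) performs a naive check that a query names one of the resource keywords from the table above before you submit it:

```python
# Sketch: a naive lint that checks whether a query names the kind of
# resource it asks about, using the keywords from the table above.
KEYWORDS = {"project", "machine", "target", "agent", "runbook", "tenant",
            "library variable set", "environment", "feed", "account",
            "certificate", "lifecycle", "worker pool", "policy", "tag",
            "group", "step", "variable", "release"}

def mentions_resource_keyword(query: str) -> bool:
    # Substring matching is deliberately crude - good enough for a lint.
    q = query.lower()
    return any(keyword in q for keyword in KEYWORDS)

print(mentions_resource_keyword('What does "Deploy Web App" do?'))              # False
print(mentions_resource_keyword('What does the "Deploy Web App" project do?'))  # True
```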

Verifying the answers

LLMs are non-deterministic. This means that you will often get a different result when asking the same question multiple times. More importantly, it also means LLMs do not always provide the correct answer, even when asked the same question that was previously answered correctly.
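
One way to observe this non-determinism is to submit the same prompt several times and compare the answers. In the Python sketch below, `ask_octopus_ai` is a hypothetical stand-in for however you submit prompts to Octopus AI (for example, the Copilot chat), not a real API:

```python
# Sketch: observe non-determinism by repeating the same prompt.
def ask_octopus_ai(prompt: str) -> str:
    # Hypothetical placeholder - substitute your own way of submitting
    # a prompt to Octopus AI. A canned answer keeps the sketch runnable.
    return 'The latest release for the "Web App" project is 1.0.1.'

prompt = 'What is the latest release for the "Web App" project?'
answers = {ask_octopus_ai(prompt) for _ in range(5)}
if len(answers) > 1:
    print(f"Got {len(answers)} distinct answers - verify before trusting any of them")
```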

The currently available LLMs are also often unable to answer questions that require processing complex relationships. For example, the following question will often confuse today's LLMs:

Bob is in the living room.
He walks to the kitchen, carrying a cup.
He puts a ball in the cup and carries the cup to the bedroom.
He turns the cup upside down, then walks to the garden.
He puts the cup down in the garden, then walks to the garage.
Where is the ball?

In addition, the limited context that can be passed to an LLM means that it may not have all the information required to answer the question correctly.

This means you must always confirm the answers provided by Octopus AI before making any decisions.

In practice, this means Octopus AI can be useful for indicating the presence of something, such as an error in a deployment log, that can then be independently verified. Octopus AI is less useful for indicating the absence of something, such as a report of a dependency vulnerability, as verifying the absence would require inspecting all the relevant data manually, negating the point of using Octopus AI in the first place.
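
For example, if Octopus AI reports the latest release of a project, you can confirm the claim directly against the Octopus REST API. The sketch below is one way to do that; the space ID (Spaces-1), the project slug (web-app), the environment variable names, and the use of the third-party requests library are all assumptions to adjust for your own instance:

```python
# Sketch: independently verify an Octopus AI answer about the latest release
# by querying the Octopus REST API. Spaces-1 and web-app are assumptions.
import os
import requests  # third-party: pip install requests

octopus_url = os.environ["OCTOPUS_URL"]  # e.g. https://example.octopus.app
headers = {"X-Octopus-ApiKey": os.environ["OCTOPUS_API_KEY"]}

# Look up the project by its slug, then list its releases.
project = requests.get(
    f"{octopus_url}/api/Spaces-1/projects/web-app", headers=headers
).json()
releases = requests.get(
    f"{octopus_url}/api/Spaces-1/projects/{project['Id']}/releases", headers=headers
).json()

# Releases come back as a paginated collection, typically newest first -
# confirm the ordering against your own instance.
print("Latest release:", releases["Items"][0]["Version"])
```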

Small changes can have a big impact

Making small changes to your query can have a big impact on the answer. For example, this query has failed in the past to find the list of URLs in a deployment log:

Find any URLs contained in the deployment logs for the "Octopus Copilot Function" project in the "Production" environment.

The following query replaces the word "find" with "extract" and generates the desired answer:

Extract any URLs contained in the deployment logs for the "Octopus Copilot Function" project in the "Production" environment.

It is not obvious why such a small change can be the difference between getting the answer you need and not. The opaque nature of LLMs is a well-known challenge.

So if you do not get the results you expect, try altering your query slightly to see if that affects the results.
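
A small loop over rephrased prompts makes this kind of experimentation systematic. As before, `ask_octopus_ai` is a hypothetical placeholder for your own submission mechanism:

```python
# Sketch: try small variations of the same query and compare the answers.
def ask_octopus_ai(prompt: str) -> str:
    return "..."  # hypothetical placeholder, as in the earlier sketch

template = (
    '{verb} any URLs contained in the deployment logs for the '
    '"Octopus Copilot Function" project in the "Production" environment.'
)
for verb in ("Find", "Extract", "List"):
    prompt = template.format(verb=verb)
    print(prompt, "->", ask_octopus_ai(prompt))
```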

Ask "what" the answer is

Prefer asking questions like:

What is the latest deployment of the project "Project Name" in the "Production" environment?

Asking what the answer is makes it more likely that the LLM returns the actual result.

Avoid statements like these:

Find the latest deployment of the project "Project Name" in the "Production" environment.
List the latest deployment of the project "Project Name" in the "Production" environment.
Display the latest deployment of the project "Project Name" in the "Production" environment.
Print the latest deployment of the project "Project Name" in the "Production" environment.

They often result in the LLM providing instructions on how to find the information rather than providing the answer.

Give the LLM permission to not provide an answer

LLMs really want to give you an answer, even if the correct answer is that there is no answer or there are no matches to your question.

For example, this question attempts to find any manual intervention steps in a project that does not have any:

What are the steps in the "EKS Octopub Audits" project with a step type of "Octopus.Manual"? Show the step name and type in a markdown table.

The LLM will helpfully (and incorrectly) report that every step is of type Octopus.Manual and then clearly list every step with a step type that is not of type Octopus.Manual:

Here is the list of steps in the "EKS Octopub Audits" project with a step type of "Octopus.Manual":

| Step Name | Step Type |
| --- | --- |
| Generate Variables | Octopus.Script |
| Smoke Test | Octopus.KubernetesRunScript |
| Security Scan | Octopus.Script |
| Feedback | Octopus.Script |

By giving the LLM permission to report if there are no matching steps, we get a more accurate answer:

What are the steps in the "EKS Octopub Audits" project with a step type of "Octopus.Manual"? If you can not find any matching steps, say so in the answer. Show the step name and type in a markdown table.

This is the result:

The project "EKS Octopub Audits" has the following steps with a step type of "Octopus.Manual":

| Step Name | Step Type |
| --- | --- |
| Generate Variables | Octopus.Script |
| Smoke Test | Octopus.KubernetesRunScript |
| Security Scan | Octopus.Script |
| Feedback | Octopus.Script |

Note: There are no steps with a step type of "Octopus.Manual" in the "EKS Octopub Audits" project.
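
If you query Octopus AI from a script, one lightweight pattern (shown below as a hypothetical sketch) is to append this clause to every prompt automatically:

```python
# Sketch: give every prompt explicit permission to report no matches.
NO_MATCH_CLAUSE = " If you can not find any matching steps, say so in the answer."

def with_permission_to_fail(prompt: str) -> str:
    return prompt.rstrip() + NO_MATCH_CLAUSE

print(with_permission_to_fail(
    'What are the steps in the "EKS Octopub Audits" project '
    'with a step type of "Octopus.Manual"? '
    'Show the step name and type in a markdown table.'
))
```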

Use timezone-aware date formats

The LLM doesn't know what your local timezone is, so when entering dates, use a timezone-aware format like ISO 8601.

The LLM also does not know to convert times into your local timezone, but you can ask for output values to be converted for you:

Your answer will be displayed as a markdown table. 
The status "success" is represented with the 🟢 character. 
The status "In Progress" is represented by the 🔵 character.
Other statuses are represented with the 🔴 character. 
Find the deployments created after 2024-04-07T00:00:00+10:00 and before 2024-04-08T00:00:00+10:00.
List the release version, created time (in the Brisbane timezone), release notes, deployer, and status.
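
If you generate these timestamps in a script, Python's standard library can produce timezone-aware ISO 8601 strings. The sketch below reproduces the Brisbane timestamps used in the example above:

```python
# Sketch: generate timezone-aware ISO 8601 timestamps for a query.
from datetime import datetime, timedelta
from zoneinfo import ZoneInfo  # standard library in Python 3.9+

brisbane = ZoneInfo("Australia/Brisbane")
start = datetime(2024, 4, 7, tzinfo=brisbane)
end = start + timedelta(days=1)

print(start.isoformat())  # 2024-04-07T00:00:00+10:00
print(end.isoformat())    # 2024-04-08T00:00:00+10:00
```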

Responses that mention HCL or Terraform

The context passed to the LLM is a Terraform representation of the resources in a space, and HCL is the language used to define Terraform resources. The LLM is passed your original question along with a large blob of HCL defining the state of the space as the context from which to answer the question. This is why the LLM may reference Terraform or HCL in an answer even though the question did not mention them.
