Creating a Hello World prompt tool

Introduction

Creating a new "tool" in Octopus Copilot means you want to include additional functionality to handle another prompt case. These tools are used by the OpenAI function (or tool) calling feature.

At a high level, the workflow to execute prompts is as follows (a pseudocode sketch appears after this list):

  1. Functions are exposed as tools by the LLM middleware layer
  2. The prompt and the tool definitions are passed to the LLM
  3. The LLM selects the tool to be called and extracts the function parameters
  4. The tool is called with the parameters
  • If the tool requires the LLM to answer the prompt:
    1. The context required to answer the prompt is built
    2. The prompt is passed back to the LLM a second time, along with the additional context
    3. The LLM response is passed to the caller
  • If the tool can answer the prompt directly:
    1. The tool response is returned to the caller
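
The sketch below is illustrative only; the function and attribute names (execute_prompt, select_tool, needs_llm_answer, answer) are made up for this page and do not correspond to the real middleware API:

# Pseudocode sketch of the flow above. All names are illustrative.
def execute_prompt(prompt, tools, llm):
    # Steps 1-3: the prompt and tool definitions are passed to the LLM,
    # which selects a tool and extracts the function parameters
    tool, arguments = llm.select_tool(prompt, tools)

    # Step 4: the selected tool is called with the extracted parameters
    result = tool(**arguments)

    if result.needs_llm_answer:
        # The tool built additional context, so the prompt is passed back to
        # the LLM a second time along with that context
        return llm.answer(prompt, context=result.context)

    # The tool answered the prompt directly (as our Hello World tool will)
    return result.response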

In this exercise, we'll create a tool that directly responds to prompts like:

Hello world, from Andrew

Creating a new tool

Tool location

The first step is to navigate to the project's tools folder. This folder contains the tools that are executed when OpenAI matches a user's prompt to one of the tools in this folder.


The tools are further categorized by type. For example, we have tools that are classified under:

  • CLI: These tools live in the cli folder. They are executed from the command line. See this section for an example.
  • Generic: These tools live in the generic folder and aren't specific to any particular context (with regards to how they are called)
  • GitHub Actions: These tools live in the githubactions folder. This is where most of the tools live today, as users originally interacted with the tools via the GitHub Copilot chat window in VS Code, JetBrains Rider, etc.

Important

The CLI is only used for testing. We do not ship a CLI version of the tool, and there is no use case for customers to use the CLI.

Note

You might have noticed that there is a wrapper folder too. To separate the implementation of the prompt handler from its definition, we add slim function definitions in this folder. In nearly all cases, they pass through a callback to the actual implementation and add some basic comments to help OpenAI make the right selection when choosing a tool, e.g., by providing sample prompts it should use.

The tool wrapper

First, we create the wrapper for the Hello World tool. Create a new Python file called hello_world.py in the tools/wrapper folder and include the following code:

def hello_world_wrapper(query, callback, logging):
    def hello_world(
        persons_name,
        **kwargs,
    ):
        """Answers a prompt like "Hello World!". Use this function when the query is not a question, but someone
            saying Hello World to you, optionally including their own name. Queries can look like those in the following list:
        * Hello World!
        * Hello World, from Mary!

            Args:
            persons_name: The (optional) persons name
        """

        if logging:
            logging("Enter:", "hello_world")

        for key, value in kwargs.items():
            if logging:
                logging(f"Unexpected Key: {key}", "Value: {value}")

        # This is just a passthrough to the original callback
        return callback(query, persons_name)

    return hello_world

Wrapper functions provide a clean separation between "environment" configuration (e.g., environment variables, user details, logging callback functions, etc.) and prompt arguments (the values extracted from the prompt). The wrapper function takes environment configuration as arguments. It then returns a function that takes prompt arguments. The nested function captures the environment configuration and has direct access to the prompt arguments, giving it all the information it needs. However, the nested function, which is presented to the LLM tool calling middleware, only exposes arguments expected to be populated from a prompt. This removes any ambiguity around what arguments the LLM is expected to extract from a prompt and pass to the function.
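
As a rough illustration of how the pieces fit together (the production code wires this up through the tool calling middleware, not by calling the wrapper directly, and my_callback is made up for this example):

# Illustrative only: exercising the wrapper by hand.
def my_callback(query, persons_name):
    return f"Callback handled '{query}' for {persons_name}"

# The environment configuration (query, callback, logging) is captured by the wrapper...
tool = hello_world_wrapper("Hello World, from Mary!", my_callback, print)

# ...while the returned function only exposes the prompt argument the LLM extracts.
# This logs "Enter: hello_world" and then prints the callback's response.
print(tool("Mary"))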

Tip

Wrappers can return multiple functions. This can be used when the same tool has many different prompts that might be used to execute it. There is a limit on the size of the comment associated with a function when used by the tool calling middleware. Returning multiple functions overcomes this limitation by describing the same tool in multiple different ways. See how_to.py for an example.
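
A minimal sketch of that shape, using made-up names (see how_to.py for the real example):

# Illustrative only: a wrapper returning several functions that route to the same callback,
# each with its own docstring describing a different family of prompts.
def greetings_wrapper(query, callback, logging):
    def hello_world(persons_name, **kwargs):
        """Answers greetings like "Hello World!" or "Hello World, from Mary!"."""
        return callback(query, persons_name)

    def wave_hello(persons_name, **kwargs):
        """Answers prompts like "Wave hello to the world" or "Say hi from Mary"."""
        return callback(query, persons_name)

    return [hello_world, wave_hello]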

The code provides a docstring comment at the top and a single parameter called persons_name. The comment helps the LLM (OpenAI) by giving examples of the type of prompt that could be suitable for this function. There is no consistently reliable way to build these examples in the comments; it's mostly trial and error, with tests executed to ensure the LLM (OpenAI) selects the right function at runtime.

You can add multiple parameters as necessary. Just ensure you pass them through in the callback at the end.
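
For example, a hypothetical second parameter (greeting_language is made up for illustration) would be added to the inner function's signature, described in the docstring, and passed through to the callback, which must accept a matching argument:

def hello_world_wrapper(query, callback, logging):
    def hello_world(
        persons_name,
        greeting_language,  # hypothetical second parameter extracted from the prompt
        **kwargs,
    ):
        """Answers a prompt like "Hello World!", optionally in another language.

        Args:
            persons_name: The (optional) person's name
            greeting_language: The (optional) language to respond in
        """
        return callback(query, persons_name, greeting_language)

    return hello_world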

The function also adds basic logging to show the call is being executed and logs any unexpected arguments passed to it. This can be beneficial for debugging since AI has been known to hallucinate.

Tool for the web interface

To add the tool to the web interface (the Chrome extension) and the GitHub Copilot extension, we must create the tool implementation in the tools/githubactions folder.

Note

The code separates the wrapper functions from the implementation to support the requirements and limitations of multiple clients. For example, GitHub Copilot and OctoAI support markdown responses, while the CLI only supports plain text. This means we would have two implementations to support a prompt: one that generated markdown and one that generated plain text.
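
A purely illustrative example of the difference (these functions are not in the code base):

# Hypothetical: the same response built for a markdown-capable client and a plain-text one.
def build_hello_world_markdown(persons_name):
    return f"**Hello world** back to you, *{persons_name}*!"


def build_hello_world_plain_text(persons_name):
    return f"Hello world back to you, {persons_name}!"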

Important

For now, we are using the GitHub Copilot implementations for OctoAI. These implementations may be separated in future.

Open the tools/githubactions folder and create a file called hello_world_implementation.py.

Add the following code:

from domain.response.copilot_response import CopilotResponse
from domain.tools.debug import get_params_message


def hello_world(github_user, logging):
    def hello_world_implementation(query, persons_name):
        """Returns a response to a hello world request."""

        debug_text = get_params_message(
            github_user, True, hello_world.__name__, persons_name=persons_name
        )

        # Do any additional prompt processing here. For example, calling out to Octopus, or other APIs
        # If you need to add a callback for post-confirmation processing, you can also save any arguments needed.

        logging(
            "hello_world",
            f"""
            Persons Name: {persons_name}""",
        )
        response = ["Hello world back to you."]

        if persons_name:
            response.extend(f"Nice to meet you, {persons_name}!")

        response.extend(debug_text)

        return CopilotResponse("\n\n".join(response))

    return hello_world_implementation

Note

The tool implementation can also include confirmation prompt handlers where necessary. Looking at other examples such as cancel_task.py, you will see a cancel_task_confirm_callback_wrapper function at the top of the file. This is because best practice recommends that any mutating action performed by the Copilot extension (originally via the Copilot chat window in VS Code, etc.) should first prompt the user to confirm the action that's about to take place; only once they confirm is the action performed.
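
A rough sketch of that pattern, with made-up names (see cancel_task.py for the real implementation):

from domain.response.copilot_response import CopilotResponse


# Illustrative only: the confirmation callback runs after the user approves the action.
def delete_widget_confirm_callback_wrapper(github_user, logging):
    def delete_widget_confirm_callback(space_name, widget_name):
        logging("delete_widget_confirm_callback", f"Deleting {widget_name} in {space_name}")

        # The mutating API call would happen here, only after the user has confirmed

        return CopilotResponse(f"{widget_name} was deleted.")

    return delete_widget_confirm_callback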

Adding to available tools

Once you have your tool, you next need to add it to the available tools list and wire up any callbacks. For the web interface (the Chrome extension) and the GitHub Copilot extension, this is done in the copilot_request_context.py file.

Navigate to the build_form_tools function, where the collection of available tools is configured.

Add the following code to the function, after the last existing FunctionDefinition:

FunctionDefinition(
    hello_world_wrapper(
        query,
        callback=hello_world(get_github_user_from_form(req), log_query),
        logging=log_query,
    )
),

Important

  • Ensure the import statement is included in the copilot_request_context.py file, e.g.
    • from domain.tools.wrapper.hello_world import hello_world_wrapper
  • Ensure the fallback and invalid-parameters function definitions come after your inserted function.


Testing the new tool

Tests are located under the tests folder in the project. There are several types:

  • Application tests: Broadly speaking, these are end-to-end tests that exercise the main prompt handling, with some tests interacting with a real Octopus instance (spun up using docker-compose).
  • Domain tests: These are similar to unit tests. They typically don't require an Octopus instance to run.
  • Experiments tests: These don't run as part of the main build and are used to experiment with the LLM (OpenAI).
  • Infrastructure tests: These are end-to-end / integration tests that test things like interactions with the Octopus API or OpenAI.

Tip

The backend is structured around the high-level pattern described in Design a DDD-oriented microservice which defines what logic the Application, Domain, and Infrastructure layers should contain.

Adding domain tests

For our hello world tool, we could add a domain test to determine whether some domain logic was working correctly. For example, if we decided to create a new domain function to return a response based on whether a person's name was present in the prompt, we could test that with different inputs.

Let's change the hello_world_implementation.py to the following:

from domain.response.copilot_response import CopilotResponse
from domain.tools.debug import get_params_message


def build_hello_world_response(persons_name):
    response = "Hello world back to you."

    if persons_name:
        response += f"It's nice to meet you, {persons_name}!"

    return response


def hello_world(github_user, logging):
    def hello_world_implementation(query, persons_name):
        """Returns a response to a hello world request."""

        debug_text = get_params_message(
            github_user, True, hello_world.__name__, persons_name=persons_name
        )

        # Do any additional prompt processing here. For example, calling out to Octopus, or other APIs
        # If you need to add a callback for post-confirmation processing, you can also save any arguments needed.

        logging(
            "hello_world",
            f"""
            Persons Name: {persons_name}""",
        )
        response = [build_hello_world_response(persons_name)]

        response.extend(debug_text)

        return CopilotResponse("\n\n".join(response))

    return hello_world_implementation

This adds a function called build_hello_world_response. Next, we'll add a test for this new function.

Add a new Python file called test_hello_world_response.py in the tests/domain folder with the following code:

import unittest

from domain.tools.githubactions.hello_world_implementation import (
    build_hello_world_response,
)


class HelloWorldResponseTest(unittest.TestCase):
    def test_build_hello_world_response_no_person(self):
        response_text = build_hello_world_response(None)
        self.assertTrue(
            "Hello world back to you." in response_text,
            "Response was " + response_text,
        )
        self.assertFalse(
            "It's nice to meet you" in response_text,
            "Response was " + response_text,
        )

    def test_build_hello_world_response_with_person(self):
        response_text = build_hello_world_response("Andrew")
        self.assertTrue(
            "It's nice to meet you, Andrew" in response_text,
            "Response was " + response_text,
        )

You can run the tests by clicking the green play button at the top of the test class.


You should also be able to see your tests passing 🎉

Adding end-to-end tests

Testing functions in isolation is good, but exercising the function to ensure the LLM (OpenAI) actually chooses the tool you have created is probably the most important test you should write. To do this, we add tests in the tests/application folder.

Tip

Application layer tests (or end-to-end tests) create a real Octopus instance using Testcontainers, populate the Octopus instance using the Octopus Terraform provider, and process prompts using the Azure OpenAI platform. This provides a high degree of confidence that your prompts work as expected.

Tests are run in parallel to optimize efficiency as part of the automated build. You could add a test to an existing test file or create a new one, which is what we'll do for this tool.

Tip

Before running/debugging any end-to-end tests, ensure you are running octoterra, octolint, and the azure-storage azurite container.

In the tests/application folder, add a file called test_copilot_hello_world.py and add the following code:

import unittest

from openai import RateLimitError
from retry import retry

from domain.transformers.sse_transformers import convert_from_sse_response
from function_app import copilot_handler_internal
from tests.application.test_copilot_chat import (
    build_no_octopus_request,
)


class CopilotHelloWorldChatTest(unittest.TestCase):
    """
    Tests that do not rely on an Octopus instance.
    """

    @retry((AssertionError, RateLimitError), tries=3, delay=2)
    def test_hello_world_no_person(self):
        prompt = "Hello world!"
        response = copilot_handler_internal(build_no_octopus_request(prompt))
        response_text = convert_from_sse_response(response.get_body().decode("utf8"))

        self.assertIn(
            "hello world back to you",
            response_text.casefold(),
            "Response was " + response_text,
        )
        self.assertNotIn(
            "nice to meet you",
            response_text.casefold(),
            "Response was " + response_text,
        )

    @retry((AssertionError, RateLimitError), tries=3, delay=2)
    def test_hello_world_with_person(self):
        prompt = "Hello world, from Barry!"
        response = copilot_handler_internal(build_no_octopus_request(prompt))
        response_text = convert_from_sse_response(response.get_body().decode("utf8"))

        self.assertIn(
            "it's nice to meet you, barry",
            response_text.casefold(),
            "Response was " + response_text,
        )

You can debug the code in PyCharm and inspect the parameters in your code, e.g., persons_name, to see the extracted value that OpenAI has passed to the hello_world_implementation function.

