[DMP 2024] Automatically generate job expressions from prompts

We are excited to participate in the 2024 Direct Mentoring Program by supporting two software engineers interested in contributing to Open Source Gov4Tech programs. This public channel is dedicated to discussing our sponsored feature: **Automatically generate job expressions from prompts**.

Mentors, contributors, and community members are welcome to ask and answer questions related to this topic here.

Feel free to engage and share your insights!

Thank you for this opportunity. I am pleased to inform you that I have been selected for this project.

For this feature, I have planned the following approach:

We can leverage the existing Apollo (formerly Gen) repository. I noticed that you have set up the initial framework for making calls to Apollo services through the CLI.

To move forward, I propose creating a service for job generation using the following inputs. Users would provide these inputs via a .json file:

{
  "api_key": "apiKey",
  "adaptor": "@openfn/language-dhis2@4.0.3",
  "data": {
    "name": "bukayo saka",
    "gender": "male"
  },
  "signature": "Create a new trackedEntityInstance 'person' in dhis2 for the 'dWOAzMcK2Wt' orgUnit."
}
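Before the payload reaches the service, it may help to check it when it is loaded. The following is a minimal sketch, assuming all four keys in the sample above are required; `load_payload` and `REQUIRED_KEYS` are hypothetical names and not part of Apollo or the CLI.

```python
import json

# Assumption: the four keys from the sample payload are all required.
REQUIRED_KEYS = ("api_key", "adaptor", "data", "signature")


def load_payload(text: str) -> dict:
    """Parse a JSON payload and check that the expected keys are present."""
    payload = json.loads(text)
    if not isinstance(payload, dict):
        raise ValueError("payload must be a JSON object")
    missing = [key for key in REQUIRED_KEYS if key not in payload]
    if missing:
        raise ValueError("missing keys: " + ", ".join(missing))
    if not isinstance(payload["data"], dict):
        raise ValueError("'data' must be a JSON object")
    return payload


# The sample payload from the post, inlined so the sketch runs on its own.
sample = """
{
  "api_key": "apiKey",
  "adaptor": "@openfn/language-dhis2@4.0.3",
  "data": {"name": "bukayo saka", "gender": "male"},
  "signature": "Create a new trackedEntityInstance 'person' in dhis2 for the 'dWOAzMcK2Wt' orgUnit."
}
"""
payload = load_payload(sample)
print(payload["adaptor"])
```

Failing early with a list of the missing keys would give users a clearer error than an attribute lookup failing deep inside the service.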

The CLI command `openfn apollo job_expression_generator tmp/data.json -o tmp/output.json` would then call the job generation service on the Apollo server and write the result to the output file.

For job generation on the server, we can create a job_expression_generator service. This service would parse inputs from the .json file and generate the required output. Below is a sample implementation:

from util import DictObj, createLogger

from .utils import (
    generate_job_prompt,
)

from inference import inference


logger = createLogger("job_expression_generator")


class Payload(DictObj):
    api_key: str
    adaptor: str
    signature: str
    data: dict


# Generate a job expression from the adaptor, the instructions ("signature"), and the sample input data
def main(dataDict) -> str:
    data = Payload(dataDict)
    logger.info("Running job expression generator with adaptor {}".format(data.adaptor))
    # Payload defines adaptor, signature, and data, so pass those fields through
    result = generate(data.adaptor, data.signature, data.data, data.get("api_key"))
    logger.success("Job expression generation complete!")
    return result


def generate(adaptor_spec, instructions, sample_input, key) -> str:
    prompt = generate_job_prompt(adaptor_spec, instructions, sample_input)

    result = inference.generate("gpt3_turbo", prompt, {"key": key})

    return result

The prompt for this might look like:

prompts = {
    "job_expression": (
        "You are a helpful JavaScript code assistant.",
        "Below is a description of a task along with the adaptor specification and sample input data. "
        "Generate a JavaScript job expression that performs the task described. Ensure the job expression "
        "follows the conventions defined in the adaptor documentation.\n\n"
        "Adaptor: {adaptor}\n"
        "Instructions: {signature}\n"
        "Sample Input: {sample_input}\n"
        "====",
    ),
}
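One way `generate_job_prompt` could fill this template is sketched below. The helper's body is not shown in this proposal, so this is an assumption: the template is adapted from the one above, the first tuple element is treated as the system message, and the sample input is serialized with `json.dumps`.

```python
import json

# Template adapted from the proposal: (system message, user message template).
JOB_EXPRESSION_PROMPT = (
    "You are a helpful JavaScript code assistant.",
    "Below is a description of a task along with the adaptor specification and sample input data. "
    "Generate a JavaScript job expression that performs the task described. Ensure the job expression "
    "follows the conventions defined in the adaptor documentation.\n\n"
    "Adaptor: {adaptor}\n"
    "Instructions: {signature}\n"
    "Sample Input: {sample_input}\n"
    "====",
)


def generate_job_prompt(adaptor: str, signature: str, sample_input: dict) -> tuple:
    """Return (system_message, user_message) with the placeholders filled in."""
    system_message, user_template = JOB_EXPRESSION_PROMPT
    user_message = user_template.format(
        adaptor=adaptor,
        signature=signature,
        # Braces inside substituted values are safe; only the template is parsed.
        sample_input=json.dumps(sample_input, indent=2),
    )
    return system_message, user_message


system_message, user_message = generate_job_prompt(
    "@openfn/language-dhis2@4.0.3",
    "Create a new trackedEntityInstance 'person' in dhis2 for the 'dWOAzMcK2Wt' orgUnit.",
    {"name": "bukayo saka", "gender": "male"},
)
print(user_message)
```

If the template ever needs literal braces (for example, an inline JavaScript snippet), they would have to be doubled as `{{` and `}}` for `str.format` to leave them alone.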

For testing, we can run this with sample inputs from the CLI, write tests in the Apollo repo itself, or both.
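For tests inside the Apollo repo, the model call could be replaced with a stub so they run without an API key. Here is a minimal sketch using `unittest`. It assumes a hypothetical variation of `generate` that takes the inference function as a parameter; `build_prompt` and `fake_inference` are illustrative stand-ins, and the canned reply is a placeholder string rather than real model output.

```python
import unittest


def build_prompt(adaptor, signature, sample_input):
    # Illustrative stand-in for generate_job_prompt.
    return "Adaptor: {}\nInstructions: {}\nSample Input: {}".format(
        adaptor, signature, sample_input
    )


def generate(adaptor, signature, sample_input, key, infer):
    # Hypothetical variation: the inference call is injected so tests can stub it.
    prompt = build_prompt(adaptor, signature, sample_input)
    return infer("gpt3_turbo", prompt, {"key": key})


class GenerateTest(unittest.TestCase):
    def test_prompt_and_key_reach_the_model(self):
        calls = []

        def fake_inference(model, prompt, options):
            # Record the call and return a canned placeholder reply.
            calls.append((model, prompt, options))
            return "/* generated job expression */"

        result = generate(
            "@openfn/language-dhis2@4.0.3",
            "Create a person",
            {"name": "bukayo saka"},
            "apiKey",
            fake_inference,
        )

        self.assertEqual(result, "/* generated job expression */")
        self.assertEqual(len(calls), 1)
        model, prompt, options = calls[0]
        self.assertEqual(model, "gpt3_turbo")
        self.assertIn("@openfn/language-dhis2@4.0.3", prompt)
        self.assertIn("Create a person", prompt)
        self.assertEqual(options, {"key": "apiKey"})
```

Saved as a test module, this can be run with `python -m unittest`. The same stub could be swapped for the real inference call in a separate, opt-in integration test.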

I believe this approach aligns with what you’re looking for. Could you please provide feedback on whether I am on the right track or suggest any improvements? Your guidance would be greatly appreciated.