We are excited to participate in the 2024 Direct Mentoring Program by supporting two software engineers interested in contributing to Open Source Gov4Tech programs. This public channel is dedicated to discussing our sponsored feature: Generating project.yaml files from a list of steps.
Mentors, contributors, and community members are welcome to ask and answer questions related to this topic here.
Feel free to engage and share your insights!
opened 01:15PM - 04 Mar 24 UTC
DMP 2024
---
name: Generate a project.yaml file from a list of steps
about: OpenFn's … submission for the Code for GovTech program
title: Generate a project.yaml file from a list of steps
labels: OpenFn, Lightning, AI, Python
assignees: ''
---
OpenFn is an open source platform for data integration and workflow automation which can be used via a [CLI](https://github.com/OpenFn/kit/tree/main/packages/cli) or a [web UI](http://github.com/openfn/lightning).
Projects on OpenFn can be encoded in a yaml file containing workflows, steps, jobs, triggers and edges.
### What are these terms?
A workflow contains one trigger and one or more steps connected by edges.
Workflows are the basis of a user's activity: the steps related to an objective are built and saved together in a workflow.
A step is a task or instruction that users write, mostly in JavaScript. A step can do anything from sending a text message or transforming data to making an API call, sending data to an external system, or fetching data from one. These steps, which appear as `jobs` in the yaml file, are performed with the help of adaptors.
Triggers start a workflow run, either in response to an event or at a scheduled time. An edge connects two steps and determines the order and conditions under which the steps in a workflow run.
See more details at https://docs.openfn.org/documentation/get-started/terminology
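As a rough illustration of how these terms relate to each other, the sketch below models them as Python dataclasses. The class and field names are assumptions made for this example only; they are not OpenFn's actual schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Step:
    name: str
    adaptor: str          # the adaptor that helps perform the step
    body: str = ""        # the code the user writes, mostly JavaScript


@dataclass
class Trigger:
    type: str             # e.g. "webhook" for an event-based trigger


@dataclass
class Edge:
    source: str           # name of the trigger or step the edge starts from
    target: str           # name of the step to run next
    condition: str = "always"


@dataclass
class Workflow:
    name: str
    trigger: Trigger      # a workflow has exactly one trigger
    steps: List[Step]     # and one or more steps
    edges: List[Edge] = field(default_factory=list)

    def __post_init__(self):
        if not self.steps:
            raise ValueError("a workflow needs at least one step")


# A one-step workflow started by a webhook.
demo = Workflow(
    name="demo",
    trigger=Trigger("webhook"),
    steps=[Step("say-hello", "common")],
    edges=[Edge("webhook", "say-hello")],
)
```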
### What the project.yaml file looks like:
Below is an example of a yaml file for a project `openhie-project` which has one workflow `OpenHIE-Workflow` and 4 steps `[ "FHIR-standard-Data-with-change", "Send-to-OpenHIM-to-route-to-SHR", "Notify-CHW-upload-successful", "Notify-CHW-upload-failed"]`. The trigger is a webhook event, and there are 4 edges whose `source_trigger` or `source_job`, `target_job` and `condition` values determine how the workflow is executed.
```yaml
name: openhie-project
description: Some sample
# credentials:
# globals:
workflows:
  OpenHIE-Workflow:
    name: OpenHIE Workflow
    jobs:
      FHIR-standard-Data-with-change:
        name: FHIR-standard-Data-with-change
        adaptor: '@openfn/language-http@latest'
        enabled: true
        # credential:
        # globals:
        body: |
          fn(state => {
            console.log("hello world")
            return state
          });
      Send-to-OpenHIM-to-route-to-SHR:
        name: Send-to-OpenHIM-to-route-to-SHR
        adaptor: '@openfn/language-http@latest'
        enabled: true
        # credential:
        # globals:
        body: |
          fn(state => state);
      Notify-CHW-upload-successful:
        name: Notify-CHW-upload-successful
        adaptor: '@openfn/language-http@latest'
        enabled: true
        # credential:
        # globals:
        body: |
          fn(state => state);
      Notify-CHW-upload-failed:
        name: Notify-CHW-upload-failed
        adaptor: '@openfn/language-http@latest'
        enabled: true
        # credential:
        # globals:
        body: |
          fn(state => state);
    triggers:
      webhook:
        type: webhook
    edges:
      webhook->FHIR-standard-Data-with-change:
        source_trigger: webhook
        target_job: FHIR-standard-Data-with-change
        condition: always
      FHIR-standard-Data-with-change->Send-to-OpenHIM-to-route-to-SHR:
        source_job: FHIR-standard-Data-with-change
        target_job: Send-to-OpenHIM-to-route-to-SHR
        condition: on_job_success
      Send-to-OpenHIM-to-route-to-SHR->Notify-CHW-upload-successful:
        source_job: Send-to-OpenHIM-to-route-to-SHR
        target_job: Notify-CHW-upload-successful
        condition: on_job_success
      Send-to-OpenHIM-to-route-to-SHR->Notify-CHW-upload-failed:
        source_job: Send-to-OpenHIM-to-route-to-SHR
        target_job: Notify-CHW-upload-failed
        condition: on_job_failure
```
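The example above only works if every edge refers to a job or trigger defined in the same workflow. Below is a minimal sketch of that consistency check, assuming the workflow has already been parsed into nested dicts (for instance by a YAML library). `edge_problems` is a hypothetical helper written for this illustration, not part of OpenFn.

```python
from typing import Dict, List


def edge_problems(workflow: Dict) -> List[str]:
    """List edges that point at jobs or triggers the workflow does not define."""
    jobs = workflow.get("jobs", {})
    triggers = workflow.get("triggers", {})
    problems = []
    for key, edge in workflow.get("edges", {}).items():
        if edge.get("target_job") not in jobs:
            problems.append(f"{key}: unknown target_job")
        # An edge starts either from a trigger or from another job.
        if "source_trigger" in edge:
            if edge["source_trigger"] not in triggers:
                problems.append(f"{key}: unknown source_trigger")
        elif edge.get("source_job") not in jobs:
            problems.append(f"{key}: unknown source_job")
    return problems


# A tiny workflow with one valid edge and one edge to a missing job.
sample = {
    "jobs": {"fetch": {}, "notify": {}},
    "triggers": {"webhook": {"type": "webhook"}},
    "edges": {
        "webhook->fetch": {"source_trigger": "webhook", "target_job": "fetch"},
        "fetch->store": {"source_job": "fetch", "target_job": "store"},
    },
}
print(edge_problems(sample))
```

An empty list means every edge resolves; a generator could run a check like this before writing the file.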
## Developing an AI agent that generates project yaml files:
We are looking to integrate an AI agent that can generate a workflow for a user, based on instructions specifying the required steps and adaptors. This feature will enable new users of OpenFn (Lightning) to get started faster after registration by entering the steps they need. The agent will automatically build a workflow that runs without error and is ready for editing.
The OpenFn team is working on a Bun/TypeScript-based AI server that can execute Python modules without the need to deploy them to individual servers.
Hence, the deliverable for this issue is a **Python module** that can be executed from the command line, takes input from the user and returns a project.yaml file.
Note that all adaptors added to the yaml file should be given a version of `@latest` for now. This should simplify the generation task. For example, `adaptor: '@openfn/language-common@latest'`
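One way to apply the `@latest` rule is to normalise whatever adaptor name the user typed before writing it to the file. The sketch below is an assumption about how that could be done; `adaptor_spec` is a hypothetical helper, and the set of accepted input spellings is chosen for illustration.

```python
import re


def adaptor_spec(name: str) -> str:
    """Return '@openfn/language-<name>@latest' for a loosely written adaptor name."""
    short = name.strip().lower()
    # Drop the scope first so a trailing '@version' is the only '@' left.
    if short.startswith("@openfn/"):
        short = short[len("@openfn/"):]
    # Drop any pinned version such as '@1.2.3' or '@latest'.
    short = short.split("@", 1)[0]
    if short.startswith("language-"):
        short = short[len("language-"):]
    if not re.fullmatch(r"[a-z0-9][a-z0-9-]*", short):
        raise ValueError(f"not a usable adaptor name: {name!r}")
    return f"@openfn/language-{short}@latest"


print(adaptor_spec("dhis2"))
print(adaptor_spec("language-postgresql@latest"))
```

Because the value starts with `@`, it needs to be quoted when written into the yaml file, as in the example project above.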
### Here's what a list of steps might look like (also pictured in the image below):
- Fetch Referrals From Primero using the primero adaptor
- Send a text message to case officer with telerivet adaptor
- Add patient to DHIS2 with the dhis2 adaptor
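Steps phrased like the ones above can be split into a step name and an adaptor with a simple rule. This is a minimal rule-based sketch, assuming each line ends with "using/with [the] X adaptor"; `parse_step` and `Step` are hypothetical names, and a real agent would need to handle far looser phrasing.

```python
import re
from typing import NamedTuple, Optional

# Matches a trailing "using the primero adaptor" or "with telerivet adaptor".
_ADAPTOR_SUFFIX = re.compile(
    r"\s+(?:using|with)\s+(?:the\s+)?(?P<adaptor>[\w-]+)\s+adaptor\s*$",
    re.IGNORECASE,
)


class Step(NamedTuple):
    name: str
    adaptor: Optional[str]  # None when the line names no adaptor


def parse_step(line: str) -> Step:
    """Split one bullet line into the step name and the adaptor it mentions."""
    text = line.strip().lstrip("-").strip()
    match = _ADAPTOR_SUFFIX.search(text)
    if match is None:
        return Step(text, None)
    return Step(text[: match.start()].strip(), match.group("adaptor").lower())


for bullet in [
    "- Fetch Referrals From Primero using the primero adaptor",
    "- Send a text message to case officer with telerivet adaptor",
    "- Add patient to DHIS2 with the dhis2 adaptor",
]:
    print(parse_step(bullet))
```

Returning `None` for a missing adaptor leaves the fallback (for example, a general-purpose adaptor) as a separate decision for the caller.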
### When this feature is integrated with Lightning, here is how a user would use it to generate a workflow.
Building this UI/UX is NOT part of this project, but the image below should help contributors understand the mission.
![Image](https://github.com/OpenFn/kit/assets/5230530/00b2b18d-dfb0-4446-a22f-29f90b6fad79)
#### Acceptance Criteria:
- [ ] The AI agent, implemented as a Python module, should generate a project.yaml file from a set of steps with adaptors defined by the user. (See test cases)
- [ ] Set default trigger to webhook
- [ ] When generating the project.yaml file, the `name` attribute can be set to `untitled-project`
- [ ] Generated project.yaml file is OpenFn-readable
- [ ] All adaptors have an `@latest` version specifier
- [ ] The yaml file should be ready for import into the web platform
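To make these criteria concrete, here is a minimal sketch that renders a linear workflow as project.yaml text using only the standard library. It is an illustration under stated assumptions, not the project's implementation: `render_project` and `slug` are hypothetical names, the workflow key `Workflow-1` and the placeholder body `fn(state => state);` are borrowed from the examples in this issue, and the edge keys (`condition_type`, `enabled`) follow the test cases rather than a confirmed schema.

```python
import json
import re
from typing import List, Tuple


def slug(name: str) -> str:
    """Turn 'Get data from DHIS2' into the key 'Get-data-from-DHIS2'."""
    return re.sub(r"[^A-Za-z0-9]+", "-", name).strip("-")


def render_project(steps: List[Tuple[str, str]], name: str = "untitled-project") -> str:
    """Render (step name, adaptor short name) pairs as project.yaml text."""
    if not steps:
        raise ValueError("at least one step is required")
    keys = [slug(step_name) for step_name, _ in steps]
    lines = [
        f"name: {name}",
        "workflows:",
        "  Workflow-1:",
        "    name: Workflow 1",
        "    jobs:",
    ]
    for key, (step_name, adaptor) in zip(keys, steps):
        lines += [
            f"      {key}:",
            # json.dumps yields a double-quoted string, which is valid YAML.
            f"        name: {json.dumps(step_name)}",
            f"        adaptor: '@openfn/language-{adaptor}@latest'",
            "        body: |",
            "          fn(state => state);",
        ]
    lines += [
        "    triggers:",
        "      webhook:",
        "        type: webhook",
        "        enabled: true",
        "    edges:",
    ]
    # The webhook always leads to the first step; each later step runs
    # only when the step before it succeeded.
    sources = [("source_trigger", "webhook", "always")]
    sources += [("source_job", key, "on_job_success") for key in keys[:-1]]
    for (source_field, source, condition), target in zip(sources, keys):
        lines += [
            f"      {source}->{target}:",
            f"        {source_field}: {source}",
            f"        target_job: {target}",
            f"        condition_type: {condition}",
            "        enabled: true",
        ]
    return "\n".join(lines) + "\n"


print(render_project([("Get data from DHIS2", "dhis2"),
                      ("Make a comment on Asana", "asana")]))
```

The sketch only covers a straight chain of steps; branching on failure, as in the `openhie-project` example, would need the edges to be supplied explicitly.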
### Nice to haves:
- [ ] Enabling the agent to also generate job expressions for each step that can be reviewed by a human user. (See #620)
- [ ] Generate a workflow.json file which can be parsed and run in the CLI. See documentation on [CLI workflows here](https://docs.openfn.org/documentation/cli-usage#run-a-workflow)
#### Documentation and relevant links
- [Workflows](https://docs.openfn.org/documentation/build/workflows)
- [CLI workflows](https://docs.openfn.org/documentation/cli-usage#run-a-workflow)
- [Web UI](http://github.com/openfn/lightning)
- [CLI](https://github.com/OpenFn/kit/tree/main/packages/cli)
#### Test Cases:
Below are examples of prompts and their corresponding workflow.yaml for reference and testing.
Prompt 1:
- Get Data from DHIS2
- Filter out children under 2
- Aggregate the data
- Make a comment on Asana
```
Workflow-1:
  name: Simple workflow
  jobs:
    Get-data-from-DHIS2:
      name: Get data from DHIS2
      adaptor: '@openfn/language-dhis2@latest'
      # credential:
      # globals:
      body: |
    Filter-out-children-under-2:
      name: Filter out children under 2
      adaptor: '@openfn/language-common@latest'
      # credential:
      # globals:
      body: |
    Aggregate-data-based-on-gender:
      name: Aggregate data based on gender
      adaptor: '@openfn/language-common@latest'
      # credential:
      # globals:
      body: |
    make-a-comment-on-Asana:
      name: make a comment on Asana
      adaptor: '@openfn/language-asana@latest'
      # credential:
      # globals:
      body: |
  triggers:
    webhook:
      type: webhook
      enabled: true
  edges:
    webhook->Get-data-from-DHIS2:
      source_trigger: webhook
      target_job: Get-data-from-DHIS2
      condition_type: always
      enabled: true
    Get-data-from-DHIS2->Filter-out-children-under-2:
      source_job: Get-data-from-DHIS2
      target_job: Filter-out-children-under-2
      condition_type: on_job_success
      enabled: true
    Filter-out-children-under-2->Aggregate-data-based-on-gender:
      source_job: Filter-out-children-under-2
      target_job: Aggregate-data-based-on-gender
      condition_type: on_job_success
      enabled: true
    Aggregate-data-based-on-gender->make-a-comment-on-Asana:
      source_job: Aggregate-data-based-on-gender
      target_job: make-a-comment-on-Asana
      condition_type: on_job_success
      enabled: true
```
Prompt 2:
- Fetch submissions from KoboCollect with language-kobotoolbox@latest
- Push the data to a PostgreSQL database with language-postgresql@latest
- Send text message to an admin using language-twilio@0.3.4 with status of sent message
```
workflow-1:
  name: another simple workflow
  jobs:
    Fetch-submissions-from-KoboCollect:
      name: Fetch submissions from KoboCollect
      adaptor: '@openfn/language-kobotoolbox@latest'
      # credential:
      # globals:
      body: |
        // Get started by adding operations from the API reference
    Push-Data-to-PostgreSQL:
      name: Push Data to PostgreSQL
      adaptor: '@openfn/language-postgresql@latest'
      # credential:
      # globals:
      body: |
        // Get started by adding operations from the API reference
    Send-a-text-message-to-admin:
      name: Send a text message to admin
      adaptor: '@openfn/language-twilio@0.3.4'
      # credential:
      # globals:
      body: |
        // Get started by adding operations from the API reference
  triggers:
    webhook:
      type: webhook
      enabled: true
  edges:
    webhook->Fetch-submissions-from-KoboCollect:
      source_trigger: webhook
      target_job: Fetch-submissions-from-KoboCollect
      condition_type: always
      enabled: true
    Fetch-submissions-from-KoboCollect->Push-Data-to-PostgreSQL:
      source_job: Fetch-submissions-from-KoboCollect
      target_job: Push-Data-to-PostgreSQL
      condition_type: on_job_success
      enabled: true
    Push-Data-to-PostgreSQL->Send-a-text-message-to-admin:
      source_job: Push-Data-to-PostgreSQL
      target_job: Send-a-text-message-to-admin
      condition_type: on_job_success
      enabled: true
```
### Product Name
Product Name: OpenFn
### Project Name
Project Name: Generate a project Workflow.yaml file from a list of steps with adaptors
### Organization Name:
Open Function Group
### Domain
[Others]
### Tech Skills Needed:
Javascript, AI, Python
### Mentor(s)
### Complexity
[High]
### Category
[Feature], [PoC]
### Sub Category
Pick one or more of [API], [Backend], [Artificial Intelligence].
Pin4sf
June 16, 2024, 11:53am
2
I’m Shivansh, and I’ve been selected to work on this issue for DMP 2024. As a Data Science and ML enthusiast, I enjoy working on various AI-related projects. I am grateful to the OpenFn team and DMP for giving me this opportunity to work with them. I’m really excited to learn more about DPGs and industry-leading software practices through this mentorship program.
I’ve been developing a Python module that generates YAML files from workflow steps using NLP, rule-based parsing, and NER. You can check the initial AI pipeline I am following here:
My initial YAML generator implementation can be found in this Google Colab: RB+NER_Yaml.ipynb.
I hope my work aligns with the project goals. I would greatly appreciate your feedback and guidance on any potential improvements or clarifications regarding my responsibilities.
Pin4sf
August 9, 2024, 1:21pm
3
Hello everyone,
I have reached the midpoint of my service generation issue, DMP 619, and have made significant progress in generating the project workflow (project.yaml), which should be ready for import into Lightning.
I have created the gen_project service in the OpenFn Apollo repository and have been working closely with the project maintainers to resolve issues.
You can see the update in this video.
Feel free to let me know if there’s anything else you’d like to add or adjust!
Thanks for the update @Pin4sf. It’s good to see that everything is coming together nicely.
Keep it up!
#DMP2024
Pin4sf
September 21, 2024, 2:30pm
5
I have completed my issue [DMP 2024] Generate a project.yaml file from a list of steps.
You can access the contribution dashboard here: C4GT DMP 24
I would like to express my gratitude to all the maintainers who helped me throughout my mentorship program. I’ve learned so much about open source development and feel much more confident in my abilities. It was a privilege to work on this issue, and I hope to contribute more to the project in the future!
I am attaching the final demo of the project: Video
ayodele
September 30, 2024, 12:18pm
6
Congrats Shivansh and thanks for your contributions.