Is there an UpsertIf function for MySQL just like there is one for PostgreSQL and Salesforce? It would really make upserting data easier if this existed
@maathe Sorry we don’t support that. We’re considering adding fnIf to common. What sort of conditions are you writing ?
Ideally you can check if you need to run a certain step from an edge condition
Eg: If you have data state.data.name
then run your operation
upsert("table", { name: "maathe" })
I am working with a CommCare to MySQL integration and the only function available for the MYSQL connector is upsertMany unlike PostgreSQL and Salesforce that have upsertIf. I am working with multiple commcare forms whose data is going to corresponding tables in a MySQL database. upsertIf would make this seamless and eliminate the need to write multiple if conditions in each each function.
Hi @maathe
We hear you and we’d love to help.
What sort of conditions are you writing for your upserts? Would you mind sharing a code block?
The conditions I am writing filter out the name of the survey as in the code below. The job I created works well and data is transferred from commcare to the db successfully.
each(
fn((substate) => {
var createdObjects = [];
if(substate.data.form.survey_type == "Partner Registration"){
const enumerator = substate.data.form.enumerator;
var data_collection_date ="";
if(substate.data.form.data_collection_date ==""){
data_collection_date = '2024-01-01';
}
Thanks @maathe!
Don’t worry, I’m sure your code works great I’m just trying to understand your usage so I can suggest (or build) a solution for you.
My problem is that upsertIf
isn’t scalable. We can’t add -If
to every operation, eg createIf/queryIf/getIf/postIf
. In fact, I want to REMOVE upsertIf
!
I can’t see the actual upsert in your code - but how would upsertIf
help you? You’d still need to sort your data, write the condition, prepare the data, and then call out to upsert.
I think you’re doing something like this now:
each(
fn((substate) => {
if(substate.data.form.survey_type == "Partner Registration"){
const data = []
// prepare partner registration survey
upsert(data)(state)
}
})
And I expect you have a different if
block for each survey type.
How would you expect that to look if you had upsertIf
?
Apologies for not sharing the entire code block. It looks like so;
each(
fn((substate) => {
var createdObjects = [];
if(substate.data.form.survey_type == "Partner Registration"){
const enumerator = substate.data.form.enumerator;
var data_collection_date ="";
if(substate.data.form.data_collection_date ==""){
data_collection_date = '2024-01-01';
}
else{
data_collection_date = substate.data.form.data_collection_date;
}
const partnership_start_date = state.data.form.partnership_start_date;
const partner_name = state.data.form.partner_name;
const establishment_date = state.data.form.establishment_date;
const registration_number = state.data.form.registration_number;
var sector_portfolio_raw = state.data.form.sector_portfolio;
sector_portfolio_raw = sector_portfolio_raw.replace(" ", ", ");
const other_portfolio = state.data.form.specify_sector_portfolio;
const sector_portfolio = sector_portfolio_raw.replace("other", other_portfolio);
var business_type_raw = state.data.form.business_type;
business_type_raw = business_type_raw.replace(" ", ", ");
const other_business = state.data.form.specify_business_type;
const business_type = business_type_raw.replace("other", other_business);
const legal_rep = state.data.form.legal_rep;
const legal_rep_sex = state.data.form.legal_rep_sex;
const partner_district = state.data.form.partner_district;
const partner_sector = state.data.form.partner_sector;
const gps_location = state.data.form.gps_location;
const email = state.data.form.email;
const contact = state.data.form.contact;
const provinces_of_operation = state.data.form.provinces_of_operation;
const districts_of_operation = state.data.form.districts_of_operation;
const partner_id = state.data.form.partner_id;
var createdObject = {
enumerator : enumerator,
data_collection_date: data_collection_date,
partnership_start_date: partnership_start_date,
partner_name: partner_name,
establishment_date: establishment_date,
registration_number: registration_number,
sector_portfolio: sector_portfolio,
business_type: business_type,
legal_rep: legal_rep,
legal_rep_sex: legal_rep_sex,
partner_district: partner_district,
partner_sector: partner_sector,
gps_location: gps_location,
email: email,
contact: contact,
provinces_of_operation: provinces_of_operation,
districts_of_operation: districts_of_operation,
partner_id: partner_id
};
createdObjects.push(createdObject);
}
console.log(createdObjects);
return createdObjects;
}),
upsertMany(
‘1_partner_register’, // the DB table
[
{enumerator: dataValue(“enumerator”),
data_collection_date: dataValue(“data_collection_date”),
partnership_start_date: dataValue(“partnership_start_date”),
partner_name: dataValue(“partner_name”),
establishment_date: dataValue(“establishment_date”),
registration_number: dataValue(“registration_number”),
sector_portfolio: dataValue(“sector_portfolio”),
business_type: dataValue(“business_type”),
legal_rep: dataValue(“legal_rep”),
legal_rep_sex: dataValue(“legal_rep_sex”),
partner_district: dataValue(“partner_district”),
partner_sector: dataValue(“partner_sector”),
gps_location: dataValue(“gps_location”),
email: dataValue(“email”),
contact: dataValue(“contact”),
provinces_of_operation: dataValue(“provinces_of_operation”),
districts_of_operation: dataValue(“districts_of_operation”),
partner_id: dataValue(“partner_id”)
}
]
)
);
My goal is to create an each function for every table in my database using one job since I have multiple forms within one application.
Hi @maathe,
Thank you for the extra code! I have an idea about a pattern that might be a bit nicer for you.
But first, I don’t think your code is working the way you think it is. I didn’t mean to get in that that but I really think we need to address it!
You are passing two functions into each
. The first argument to each
is called once with state (your root state), and the second function is called multiple times with substate
.
Usually, each
is called with a string and a function, where the string is a path to some array in state, and the function is invoked for every item in that array.
So I think you’ll find your code only ever runs for a single item. And I can kind of see that in your use of upsertMany.
Let me try and draft up a solution that I think will work a bit better for you
Hi @jclark,
That would be super helpful. Thank you. I have tested the same code and managed to push to two separate tables using two each functions within one job and it has worked although it initially timed out but the data later pushed to the db. I am not very good with javascript so most of my attempts are using brute force means.
Hi again @maathe
Here is how I would approach the problem.
I hope the code/comments explain themselves but please do ask questions!
// Assuming state.data is an array of forms...
// first, initialise some state which we'll use later
fn((state) => {
// This is a "bucket" of data for each form type
state.buckets = {};
// This is a "bucket" of transformed data ready to be upserted
state.upsertData = {};
return state;
});
// Now filter your data into the buckets based on the survey type
each('$.data[*]', (state) => {
if (!state.buckets[state.data.form.survey_type]) {
// Initialise the bucket for this data type if it doesn't exist
state.buckets[state.data.form.survey_type] = [];
state.upsertData[state.data.form.survey_type] = [];
}
// add the data to the bucket
state.buckets[state.data.form.survey_type].push(state.data);
return state;
});
// Now transform all your partner registration forms
each(
(state) => state.buckets['Partner Registration'],
(substate) => {
const createdObject = {
enumerator: substate.data.form.enumerator,
// add all your mappings in here
};
// add the created data to the second bucket
substate.upsertData['Partner Registration'].push(createdObject);
return substate;
}
);
// Now upsert your partner registration forms
upsertMany(
'1_partner_register',
(state) => state.upsertData['Partner Registration']
);
You’ll have to repeat the transform and upsert steps for each form type.
It should be save I think to call upsert with empty data. I’m not experienced in the mySQL adaptor so you might have to play with this if you start getting errors (please let us know because we can fix any errors you encounter!)
If the job gets too big and hard to manage, consider breaking this up into several workflow steps! Create your “buckets” in one step, and then have a child step for each form type.
Thanks a lot for this @jclark.
I share feedback tomorrow after I have done tests.
Hi @jclark,
I have tried using your solution but it has the following error: TypeError [Error]: Cannot read properties of undefined (reading ‘survey_type’).
This is the code I am using;
// Assuming state.data is an array of forms…
// first, initialise some state which we’ll use later
fn((state) => {
// This is a “bucket” of data for each form type
state.buckets = {};
// This is a "bucket" of transformed data ready to be upserted
state.upsertData = {};
return state;
});
// Now filter your data into the buckets based on the survey type
each(‘$.data[*]’, (state) => {
if (!state.buckets[state.data.form.survey_type]) {
// Initialise the bucket for this data type if it doesn’t exist
state.buckets[state.data.form.survey_type] = ;
state.upsertData[state.data.form.survey_type] = ;
}
// add the data to the bucket
state.buckets[state.data.form.survey_type].push(state.data);
return state;
});
// Now transform all your partner registration forms
each(
(state) => state.buckets[‘Partner Registration’],
(substate) => {
const createdObject = {
// add all your mappings in here
enumerator: substate.data.form.enumerator,
data_collection_date: substate.data.form.data_collection_date || ‘2024-01-01’,
partnership_start_date: substate.data.form.partnership_start_date,
partner_name: substate.data.form.partner_name,
establishment_date: substate.data.form.establishment_date,
registration_number: substate.data.form.registration_number,
sector_portfolio: (substate.data.form.sector_portfolio || “”).replace(" ", ", “).replace(“other”, substate.data.form.specify_sector_portfolio),
business_type: (substate.data.form.business_type || “”).replace(” ", ", ").replace(“other”, substate.data.form.specify_business_type),
legal_rep: substate.data.form.legal_rep,
legal_rep_sex: substate.data.form.legal_rep_sex,
partner_district: substate.data.form.partner_district,
partner_sector: substate.data.form.partner_sector,
gps_location: substate.data.form.gps_location,
email: substate.data.form.email,
contact: substate.data.form.contact,
provinces_of_operation: substate.data.form.provinces_of_operation,
districts_of_operation: substate.data.form.districts_of_operation,
partner_id: substate.data.form.partner_id
};
// add the created data to the second bucket
substate.upsertData['Partner Registration'].push(createdObject);
return substate;
}
);
// Now upsert your partner registration forms
upsertMany(
‘1_partner_register’,
(state) => state.upsertData[‘Partner Registration’]
);
Hi @maathe,
That probably has something to do with the shape of the input data. You may have to adjust the code or your input to make it work - although of course you need to make sure that your input data matches the actual input you’ll get in the workflow (or from the trigger or whatever your production run will start from).
I think your original input was something like this:
{
data: {
form: {
survey_type: "Partner Registration",
/* ... etc */
}
}
}
Ie, state.data
contains a single form object.
But that will never work with each
, because you only ever have one form. So I’ve assumed state.data
is an array of forms. Like this:
{
data: [ // start of forms array
{
form: {
survey_type: "Partner Registration",
/* ... etc */
}
}
] // end of forms array
}
I suspect that’s where the problem is.
By the way, if your job will only ever receive a single form object as its input, the whole thing gets a lot easier. Let me know and I’ll show you an even easier way to handle this.
On the other hand, maybe I’m just getting the shape of your data wrong here:
// Now filter your data into the buckets based on the survey type
each(‘$.data[*]’, (state) => {
if (!state.buckets[state.data.form.survey_type]) {
// Initialise the bucket for this data type if it doesn’t exist
state.buckets[state.data.form.survey_type] = ;
state.upsertData[state.data.form.survey_type] = ;
}
// add the data to the bucket
state.buckets[state.data.form.survey_type].push(state.data);
return state;
});
Note the state.data.form.survey_type
property chain. Perhaps that’s just incorrect? You might want to console.log state.data
inside that each
to better understand the shape of the data there.
Hi @jclark, I am trying to deal with a form at a time so only a single form object is received as input. The goal here is that; once a commcare form is submitted, the data of interest is automatically pushed to the database. The survey_type is simply a key in the json object that is read by OpenFn from CommCare.
The json object looks something like this in OpenFn:
{
“data”: {
“__query_params”: {
“app_id”: “”
},
“app_id”: “”,
“archived”: false,
“attachments”: {
“form.xml”: {
“content_type”: “text/xml”,
“length”: 2893,
“url”: “”
}
},
“build_id”: null,
“domain”: “tns-sandbox”,
“edited_by_user_id”: null,
“edited_on”: null,
“form”: {
“counter”: “5”,
“registration_number”: “TEST003344”,
“partner_id”: “”,
“consent”: “”,
“commcare_usercase”: {
“case”: {
“update”: {
“contact”: “0788999666”,
“counter”: “5”,
“partner_district”: “test district”,
“partner_name”: “Test14”,
“partner_sector”: “Test Sector”
}
}
},
“partnership_start_date”: “2019-02-06”,
“survey_type”: “Partner Registration”,
“districts_of_operation”: “test1 test2”,
I hope this gives you more context on what survey_type is. SO, the goal here is to deal with a single form object as it comes from CommCare and then have that pushed into the database
Ok! Well in that case this all a lot easier. We don’t need to do any stuff with each
or anything like that.
We can simply convert the object and upsert it into the right place.
How much processing do you need to do for each form type? It looks like in your example there’s a 1:1 mapping between the incoming form fields and the values you upsert to the database, is that right?
In which case, we should be able to run the same generic data transformation on each form, and then define a little mapping between form types and database tables.
I can help with both of these steps - just need to be sure I understand the requirements. If each form type is highly curated then we’ll need to use a different approach.
Hi @jclark,
For each form type, I mostly replace the ‘other’ values with the values in the ‘specify’ key just as you can see in the initial code I shared. The only other unique thing that might be done is if I decide to create a new value within openfn using a mathematical equation. (This rarely happens though but it is still something I would like to do. Creating new fields in OpenFn and then upserting them to corresponding fields in the database table)
And yes, there is a 1:1 mapping between incoming form fields and the values upserted to the database. The mapping maps the form fields to the database fields. This same mapping is run on each form that comes from CommCare.
That’s the goal of the entire process.
Ok @maathe, see if this helps at all.
This code will automatically and generically map the form object into shape for upserting, provide a hook for specialised transformations, work out which table to upsert to, and upsert it.
I’ve run a basic test on some input data with the upsert removed and it broadly seems to work.
You’ll have to test this carefully and think about the mapping steps though. If you’re doing a lot of special mappings you might want to rethink it.
Feel free to ask questions - there’s lots to be improved about this example!
// First, let's map our incoming form object
// to a single object called "fields", which is what we'll upsert
fn((state) => {
const form = state.data.form;
const fields = {};
// Set up the collection date, just like your original code
if (form.data_collection_date == '') {
fields.data_collection_date = '2024-01-01';
} else {
fields.data_collection_date = form.data_collection_date;
}
// Now map all keys of form onto the fields object
for (const key in form) {
// Ignore the survey_type key though
if (key !== 'survey_type') {
fields[key] = form[key];
}
}
// finally, write the converted fields onto state
// we'll do more with this in a minute
state.fields = fields;
return state;
});
// If you want to do any form-specific transformation, I would do it here, in a new fn block
// It's just a bit tidier and makes it easier to find any type-specific transformations
fn((state) => {
if (state.data.form.survey_type === 'Partner Registration') {
const business_type = state.data.form.business_type.replace(' ', ', ');
const other_business = state.data.form.specify_business_type;
state.fields.business_type = business_type.replace('other', other_business);
// maybe we want to tidy some data on our upsert object too
delete state.fields.specify_business_type;
}
return state;
});
// Now we'll work out which table to map this form to
fn((state) => {
// This object maps form types to table names
// it's like a lookup table, or an index in a book
const mappings = {
'Partner Registration': '1_partner_register',
};
state.tableName = mappings[state.data.form.survey_type];
return state;
});
// Now do the actual upsert
// I've changed upsertMany to upsert here, since we're only dealing with one item
upsert(
// Lookup the correct table name out of state
(state) => state.tableName,
// And lookup the data to upsert
(state) => state.fields
);
Thanks for this @jclark. I will try and test this out and give you feedback.
To follow up on this; is this same code repeated for multiple forms? For example, if i am pushing data from 3 separate forms (partner registration, beneficiary registration, beneficiary followups), do I have to write this entire function for each of the three forms since they will be pushing to 3 separate tables?
This job script should work for all 3 form types. If you need to add form-specific stuff, you can add rules under an if statement (like I’ve done for business_type).
You can make the functions as complex as you like to handle the different forms, but basic transformation and upsert should just work right out of the box