Custom adaptor brainstorm

Background

I built a Gmail adaptor that queries the inbox by date. Since the Gmail API only officially supports querying by date range (e.g. after:2024-12-05), granularity is limited to daily increments.

Therefore, to support running the adaptor more often than once a day, the adaptor relies on a list of ids of messages that have already been processed. Each time the workflow runs, it pages through the message results, skips any messages that have already been processed, and adds any newly processed message ids to the list.
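
The bookkeeping described above can be sketched in plain JavaScript. This is only a minimal illustration, not the adaptor's actual code: partitionMessages and the sample ids are hypothetical, and it assumes every unseen message is processed successfully.

```javascript
// Hypothetical helper (not part of any adaptor): given the ids returned by
// the current query and the ids stored from earlier runs, work out which
// messages still need processing and what the stored list should become.
function partitionMessages(resultIds, processedIds) {
  const seen = new Set(processedIds);
  const current = new Set(resultIds);

  // Messages in the result set that no earlier run has handled.
  const toProcess = resultIds.filter((id) => !seen.has(id));

  // Previously processed ids that the query still returns. Ids that have
  // dropped out of the result set are pruned so the list does not grow forever.
  const stillRelevant = processedIds.filter((id) => current.has(id));

  return { toProcess, newProcessedIds: [...stillRelevant, ...toProcess] };
}

// Example: m1 has aged out of the query, m2 was already handled.
const example = partitionMessages(["m2", "m3", "m4"], ["m1", "m2"]);
console.log(example.toProcess); // [ 'm3', 'm4' ]
console.log(example.newProcessedIds); // [ 'm2', 'm3', 'm4' ]
```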

The new collections adaptor presented itself as a perfect mechanism for persisting this list between workflow runs. Here is a sample job implementation:

let userId, query, desiredContents; // omitted for brevity
const collectionName = "gmail-processed-ids";
const jobName = "job 1";

collections.get(collectionName, jobName); // adds a list of ids to state.data

// Gmail adaptor, expects a list of message ids in the state.data.
getContentsFromMessages(userId, query, desiredContents);

// The Gmail adaptor replaces state.data with the contents of the messages.
// It adds state.newProcessedIds as an updated list of processed messages,
// including the removal of messages no longer in the query result set.

// If state.newProcessedIds doesn't exist or is null, the Gmail request failed.
// Otherwise, the new list of ids is upserted into the collection.
fnIf(
  (state) => Array.isArray(state.newProcessedIds),
  collections.set(collectionName, jobName, $.newProcessedIds)
);

// Clean up the ids since they're not used for anything else.
fn((state) => {
  state.newProcessedIds = null;
  return state;
});

Question

Is there any way to reference the collections adaptor inside the custom Gmail adaptor itself? All of the collections interactions are purely administrative: they exist only to persist and retrieve the processed ids, and nothing outside the adaptor needs them.

If not collections, any thoughts on a different mechanism? Before collections, I originally thought of an externally hosted web API that would provide basically the same functionality.

If I could handle the logic and persistence for processed ids inside the Gmail adaptor, the job would be greatly simplified to:

let userId, query, desiredContents; // omitted for brevity
const jobName = "job 1";
getContentsFromMessages(userId, query, desiredContents, jobName);

Hi Jason,

Well, you can always import the collections adaptor inside the Gmail adaptor (just import { collections } from '@openfn/language-collections').

The one issue you’d have right now is that the collections credentials are only added if the collections adaptor is actually used in the job. I think if you commented collections.get at the top of the job, that would be enough to fool it. I don’t think there’s a good reason for us to be strict about that, so we’ll think about lifting that requirement.

I would question whether it’s appropriate to couple the gmail adaptor to a collection like that though. Would different users have different requirements? I expect so?

Perhaps it’s alright if you pass in the collection name to persist to?

But my instinct is that it’s more appropriate to keep this behaviour in the job code.
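
For comparison, if the collection name were passed in as suggested, one way to keep the coupling loose might be for the adaptor to depend only on a small get/set interface rather than on collections directly. The sketch below is an assumption-laden illustration: processNewMessages, memoryStore, listIds, and handle are all hypothetical names, and the in-memory store merely stands in for a collection or an external web API.

```javascript
// Hypothetical sketch: `store` is any object exposing async get/set.
// Nothing here is a real OpenFn or Gmail API.
async function processNewMessages({ store, collectionName, key, listIds, handle }) {
  // Ids recorded by earlier runs (empty on the first run).
  const processed = (await store.get(collectionName, key)) ?? [];
  const seen = new Set(processed);

  const resultIds = await listIds();
  const contents = [];
  for (const id of resultIds) {
    if (!seen.has(id)) contents.push(await handle(id));
  }

  // Reached only if listing and handling succeeded, so a failed request
  // leaves the stored list untouched.
  await store.set(collectionName, key, resultIds);
  return contents;
}

// In-memory stand-in for the persistence layer (assumption for the demo).
function memoryStore() {
  const data = new Map();
  return {
    get: async (name, key) => data.get(`${name}/${key}`),
    set: async (name, key, value) => {
      data.set(`${name}/${key}`, value);
    },
  };
}

// Two consecutive runs against the same store.
async function demo() {
  const store = memoryStore();
  const handle = async (id) => `contents of ${id}`;
  const base = { store, collectionName: "gmail-processed-ids", key: "job 1", handle };
  const first = await processNewMessages({ ...base, listIds: async () => ["m1", "m2"] });
  const second = await processNewMessages({ ...base, listIds: async () => ["m2", "m3"] });
  return { first, second };
}

demo().then(({ first, second }) => {
  console.log(first); // [ 'contents of m1', 'contents of m2' ]
  console.log(second); // [ 'contents of m3' ]
});
```

Because the store is injected, different users could point the same logic at different collections or at another backend, which speaks to the coupling concern. The job-code approach remains the simpler option.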