Access to request header

Dear all,

OpenFn makes it easy to access the body of the request; is there any way to access the headers? (I am getting the username of the submitter in a header, and I’d like to use it.)

thanks in advance

Hello @delcroip, can you share which adaptor you’re using? If you’re using the http adaptor, you can access the response from state.response. For example:

get('https://jsonplaceholder.typicode.com/users/1', {}, state => {
  const { headers } = state.response;
  console.log(headers);
  // 'x-username' is a placeholder; use whichever header carries the username
  const username = headers['x-username'];
  return { ...state, username };
});

Thank you, but my issue is that I want the headers of the triggering request, not of a request I generate in the job.

br

@delcroip I was also looking to implement some job logic based on request headers, but I’m getting the sense that this might not be possible out of the box.

While testing a job in the v1 console (a job triggered by an HTTP request) I can confirm that I’m not seeing request headers in the state.

According to the docs, the HTTP request trigger creates the following state object, which seems to confirm it:

Triggering Event | Initial State
http request     | { data: httpRequest.body, configuration: job.credential.body }

Unfortunately, the same seems to be true for Lightning (v2), as there are no request headers in the state on that version of the platform either.

Best of luck.

Justin


Thanks for this feedback,

@taylordowns2000 could you ask the devs how we could add the headers to the state in Lightning?

br

Hey folks, sorry… we’ve got most of the product team at an offsite right now, so we’ll be slow to reply in earnest. That being said, if I were to do this, I’d modify line 48 here so that you’re providing a map with the body and the request headers from the conn:

https://github.com/OpenFn/Lightning/blob/main/lib/lightning_web/controllers/webhooks_controller.ex#L48

Then, follow it downstream a bit to where state assembly happens, line 100 here is key:

https://github.com/OpenFn/Lightning/blob/main/lib/lightning/pipeline/state_assembler.ex#L100

To get this in we’ll want to see that the initial contract has been extended, and not broken. The original contract is that we’ll put the body of the HTTP request in state.data when a run is started with an :http_request dataclip. Your target should be (only in my opinion!) to preserve that rule and then also have the headers of an HTTP request get put in state.headers. (call that “option a”)

My guess is that there will be (and should be!) a bit of discussion before merging about two things:

  1. whether or not we should break the contract and put both body and headers inside state.data. (call that “option b”)

  2. (assuming “no” to the above) whether we should put headers at the top-level in state, or inside a request object, as in state.request.headers so there’s room for other stuff in the future. (call that “option c”)

These are 3 possible shapes of “initial state” when an attempt is triggered by an http request

option a - new “headers” key in state:

{
  configuration: {...}, // set by credential/secrets handler
  data: {...}, // the body of the http request
  headers: {...} // the headers of the http request
}

option b - break the contract! (data contains “body” and “headers”):

{
  configuration: {...}, // set by credential/secrets handler
  data: {
    headers: {...}, // the headers of the http request
    body: { ...} // the body of the http request
  }
}

option c - new “request” key in state:

{
  configuration: {...}, // set by credential/secrets handler
  data: {...}, // the body of the http request
  request: {
    headers: {...}, // the headers of the http request
    ... // anything else we might want to stick in here later
  }
}
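
To make the three shapes concrete, here is a minimal sketch of where job code would find the headers under each option. The helper name `readHeaders` and the sample header `x-username` are made up for illustration; this is not an existing OpenFn function.

```javascript
// Hypothetical helper (not part of OpenFn): returns the request headers from
// whichever of the three proposed shapes the initial state follows,
// or an empty object if no headers are present.
function readHeaders(state) {
  // option c: headers live under a new "request" key
  if (state.request && state.request.headers) return state.request.headers;
  // option a: headers are a new top-level key
  if (state.headers) return state.headers;
  // option b: data holds both "body" and "headers"
  if (state.data && state.data.headers) return state.data.headers;
  return {};
}

// Sample states for each option; the header name is an assumption.
const sample = { 'x-username': 'jane' };
const optionA = { configuration: {}, data: { id: 1 }, headers: sample };
const optionB = { configuration: {}, data: { headers: sample, body: { id: 1 } } };
const optionC = { configuration: {}, data: { id: 1 }, request: { headers: sample } };

console.log(readHeaders(optionA), readHeaders(optionB), readHeaders(optionC));
```

Only option b changes where existing jobs find the body (state.data.body instead of state.data), which is why it is the one that breaks the contract.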

cc @aleksa-krolls , @stu , @jclark , @aissatoudiallo , @mtuchi , @nick

  • concept - new “headers” key in state
  • alt1 - break the contract! (data contains “body” and “headers”)
  • alt2 - new “request” key in state

Thanks a lot I will check this out.

I voted for alt2 because it could also be interesting to have the auth provider and username/api key injected here in case there is “personalized” logic

Thanks @taylordowns2000. Sorry for the delay. I didn’t get a notification from Discourse about your response so just checking in a week late.

My guess is that there will be (and should be!) a bit of discussion before merging about two things:

I assume this was less about an oversight / missing feature and more about a concern with leaking security or implementation details. So definitely take your time with these discussions and changes. My interim solution (if we need it before there’s a resolution to this) is to add some approximation of the header state to our event payload. We were trying to avoid this because it requires a code change instead of a simple configuration change. OpenBoxes allows a system administrator to configure the webhook events to carry arbitrary headers.

But I have a question to ensure we’re not going down the wrong path. We essentially just need to pass the runtime environment (test, staging, prod) from which a message is produced because we share an OpenFn instance between our test and staging servers. We could also possibly determine this from the FQDN of the request. My question is would you advise against a shared OpenFn instance for multiple environments? Or do you think that’s ok? Is there a better way to approach this type of filtering?
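
As a rough sketch of the interim approach described above, a job could read the environment from the event payload and decide whether to continue. The payload field `meta.environment` and the function names are assumptions for illustration only; they are not part of OpenBoxes or OpenFn.

```javascript
// Assumption: the sender adds its runtime environment to the request body,
// e.g. { meta: { environment: 'staging' }, ...rest of the event }.
const ALLOWED_ENVIRONMENTS = ['test', 'staging'];

// Returns the environment named in the payload, or 'unknown' if it is absent.
function environmentOf(state) {
  const meta = (state.data && state.data.meta) || {};
  return meta.environment || 'unknown';
}

// True when the message comes from an environment this instance should handle.
function shouldProcess(state) {
  return ALLOWED_ENVIRONMENTS.includes(environmentOf(state));
}

const stagingEvent = { data: { meta: { environment: 'staging' }, id: 42 } };
const prodEvent = { data: { meta: { environment: 'prod' }, id: 43 } };
console.log(shouldProcess(stagingEvent), shouldProcess(prodEvent));
```

If request headers become available in state, `environmentOf` is the only function that would need to change.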

Regardless of the response, adding support for request headers would be worthwhile, as there are other use cases for them.

These are 3 possible shapes of “initial state” when an attempt is triggered by an http request

When I first started thinking about the problem, I expected to find the headers alongside config and data in the root of the message. However, looking at the options you mentioned, I don’t feel 100% about any of these solutions yet.

Option (a) seems fine and is what I expected, but only because I was thinking lazily.

As you mentioned, option (b) breaks everything, so I don’t think that should even be considered unless it was being included as part of a refactoring in a major release. Too many people would be impacted, so it doesn’t really make sense. Actually, I guess it could make sense for Lightning since that’s still in beta (?), but I still do not know if I like it as an option since it suffers from the same issue as (c).

I don’t like (c) for the reason I mentioned earlier … it’s leaking implementation details that don’t really need to be there.


So perhaps there’s an option (d).

One thing I noticed while looking at the Trigger docs is that it seems like each Trigger Type has a different shape for its state data. It seems you could potentially create a single state message for any trigger and populate them according to what information is available to the trigger.

{
  config: {...}, 

  data: {...},        // could also be called body if you were okay with breaking
                      // the contract but I don't think that's necessary

  context: {...}      // could also be called metadata
}

This common structure would allow you to pass different contextual data for different trigger types

  • HTTP request headers for message triggers (e.g. where the request is coming from, whether it wants to be retried, number of retries, etc.)
  • read-only metadata related to cron triggers (e.g. next execution time)
  • standard metadata that can be set by the client, specified either in the job or in the message trigger (e.g. a start delay)
  • common metadata that needs to be shared across jobs (e.g. number of retries, current retry count)
  • shared artifacts or results generated by a job execution (e.g. statistics, message transformation, saved state or context) to be passed throughout the current pipeline or to a future execution
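
The common structure above could be sketched as follows. This is only an illustration of the idea, not how OpenFn assembles state: the function `assembleState`, the trigger types, and the child keys under `context` are all hypothetical.

```javascript
// Hypothetical assembler for the proposed shape: the top-level keys are always
// config, data and context; only the contents of context vary by trigger type.
function assembleState(trigger, credential, payload = {}) {
  const context = { triggerType: trigger.type };
  if (trigger.type === 'webhook') {
    // HTTP-specific details stay nested under context
    context.http = { request: { headers: payload.headers || {} } };
  } else if (trigger.type === 'cron') {
    // read-only scheduling metadata
    context.cron = { nextExecution: trigger.nextExecution };
  }
  return { config: credential, data: payload.body || {}, context };
}

const webhookState = assembleState(
  { type: 'webhook' },
  { apiKey: 'not-a-real-key' },
  { body: { id: 1 }, headers: { 'x-environment': 'staging' } }
);
const cronState = assembleState(
  { type: 'cron', nextExecution: '2030-01-01T00:00:00Z' },
  {}
);
console.log(Object.keys(webhookState), Object.keys(cronState));
```

Both states expose the same three top-level keys, so a job can always look in one place for anything that is neither config nor data.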

I am basing the idea off some work I’m doing with Rest Assured / TestNG right now, but the same concept exists in other systems like build pipelines and job scheduling. The client has some ability to pass state to the context, but a lot of the context is generated by the execution engine before/during/after the current “job” is executed.

I see your concern, but it works both ways: having this information can help because you might not always be able to control the header/payload structure in the sender systems.

But I think there is one important point to respect: no secrets should be saved.

br

@jmiranda - Your reply really resonates with me. As you noted, Options A and C result in implementation details becoming top-level keys in the object, peers with “data”. In contrast, your suggestion of a “context” key does not result in implementation details at the top-level and it establishes a helpful expectation among users about where to find “that stuff that is neither config nor data”. This is a nice characteristic. For avoidance of doubt, I expect that “context” can (and should) have child-keys that are implementation-specific (e.g., http.request.headers), but this preserves the consistency of top-level keys.

I also like your idea that you could set specific context data (almost like “tags”) in the definition of the trigger. I hadn’t thought of that, but I can imagine the utility.

In the context of HTTP, I’m also interested in other details of the request. You mention IP address. User Agent might be interesting. All of these sorts of details would fold nicely into a “context” object.

When it comes to security, I would tentatively suggest that we consider handling this with a whitelist or blacklist of HTTP Header fields. I’m sure that we’ve all seen (and worked with) a variety of authentication / authorization implementations, some of which are based on standards and some aren’t. It feels like the responsibility of the workflow builder (not the OpenFn platform) to indicate what headers are interesting or sensitive (etc.).
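
A minimal sketch of the whitelist/blacklist idea, assuming headers arrive as a plain object of name/value pairs. The function `filterHeaders` and the default list of sensitive names are illustrative assumptions, not a platform feature, and the list is deliberately incomplete.

```javascript
// Header names that commonly carry secrets (illustrative, not exhaustive).
const DEFAULT_DENYLIST = ['authorization', 'cookie', 'proxy-authorization', 'x-api-key'];

// Returns a copy of headers with lowercased names, dropping anything on the
// deny list and, if an allow list is given, anything not on it.
function filterHeaders(headers, { allow = null, deny = DEFAULT_DENYLIST } = {}) {
  const allowSet = allow && new Set(allow.map(h => h.toLowerCase()));
  const denySet = new Set(deny.map(h => h.toLowerCase()));
  const result = {};
  for (const [name, value] of Object.entries(headers)) {
    const key = name.toLowerCase();
    if (denySet.has(key)) continue; // never keep known-sensitive headers
    if (allowSet && !allowSet.has(key)) continue; // allow list is opt-in
    result[key] = value;
  }
  return result;
}

const incoming = {
  Authorization: 'Bearer not-a-real-token',
  'X-Username': 'jane',
  'User-Agent': 'example-client',
};
console.log(filterHeaders(incoming));
console.log(filterHeaders(incoming, { allow: ['x-username'] }));
```

In this sketch the deny list wins over the allow list, so a header marked sensitive is dropped even if a workflow builder also lists it as interesting.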

Benson


@taylordowns2000 I tried to apply the change you proposed but the repository changed so much from my version that I cannot apply it.

I tried to get the latest version (main), but Docker complains that image: 'ws-worker:latest' cannot be found.

Do you know how to get that image?

br

More to come on the convention questions above, but @delcroip for your last q it’s possible that we’ve got a :latest tagging issue on dockerhub. Apologies if that’s the case. But please do stick with the named releases at least until we hit a stable 1.0

v0.9.3 is the last full release for Lightning.

That said, we’ve got lots of very fun stuff on main (that’s usually listed as :edge if you’re pulling from dockerhub) and you’re welcome to experiment with the v0.10.0-pre1,2,3,4... releases but we don’t expect to be at v0.10.0 until next week.

Also, based on conversations like the one above and the opportunity to provide a migration path from the old OpenFn platform users to Lightning when we reach a stable release later this year, we might still introduce some pretty heavy breaking changes.

If you’re having trouble with :edge stuff, we’d still love to hear about it (thank you for making noise :heart:, this info is invaluable) but please start another thread so we keep this one going around the headers/data/body/whats-the-ideal-shape-of-initial-state conversation!

Taylor

I might need some support.

I am trying to implement it, but I got an error and I am not sure how to solve it (I have never worked with Ecto/Elixir).

I replaced the dataclip injection with

“lib/lightning_web/controllers/webhooks_controller.ex” 54L

%Workflows.Trigger{enabled: true} = trigger ->
        {:ok, work_order} =
          WorkOrders.create_for(trigger,
            workflow: trigger.workflow,
            dataclip: %{
              body: conn.body_params,
              headers: Enum.into(conn.req_headers, %{}),
              type: :http_request,
              project_id: trigger.workflow.project_id
            }
          )

        conn |> json(%{work_order_id: work_order.id})

and the dataclip schema with

“lib/lightning/invocation/dataclip.ex” 116L

schema "dataclips" do
    field :body, :map, load_in_query: false
    field :headers, :map, load_in_query: false
    field :type, Ecto.Enum, values: @source_types
    belongs_to :project, Project

    has_one :source_run, Run, foreign_key: :output_dataclip_id

    timestamps(type: :utc_datetime_usec)
  end

I created a migration:

defmodule Lightning.Repo.Migrations.AddHeadersColumn do
  use Ecto.Migration


  def change do
    alter table(:dataclips) do
      add :headers, :map
    end
  end
end

But if I try to access state.headers, I get an exception.

(The headers are properly saved in the dataclips table.)

What did I miss?

@delcroip , and everyone else here, thanks for your input and patience!

After all the input, we’ve decided to move forward with “alt2 - new request key in state”:

  1. Add storing webhook request headers in Dataclips for use in jobs by stuartc · Pull Request #1663 · OpenFn/Lightning · GitHub
  2. display http_request dataclips in their `initial state` format #1664 by taylordowns2000 · Pull Request #1665 · OpenFn/Lightning · GitHub

Lightning v2.0.0-rc6 and beyond will have this change. If you want it now, use anything past 6ae77b0.

@delcroip , you’ll now be able to write a job like this:

fn(state => {
  const { data, request } = state;
  console.log('the headers are', request.headers);
  const user = request.headers.user_name; // or whatever
  // spread state so existing keys (data, request, ...) stay at the top level
  return { ...state, user };
});
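
For jobs that may also run without an HTTP trigger, or that should not depend on header casing, a defensive lookup could look like the sketch below. It assumes state.request.headers is a plain object, as in the job above; the helper `getHeader` and the header name `X-Username` are hypothetical.

```javascript
// Hypothetical helper: case-insensitive header lookup with a fallback, safe to
// call even when state.request is missing (e.g. a run not started by a webhook).
function getHeader(state, name, fallback = undefined) {
  const headers = (state.request && state.request.headers) || {};
  const wanted = name.toLowerCase();
  for (const [key, value] of Object.entries(headers)) {
    if (key.toLowerCase() === wanted) return value;
  }
  return fallback;
}

// The kind of callback you could pass to fn(...) in a job.
const addUser = state => ({
  ...state,
  user: getHeader(state, 'X-Username', 'anonymous'),
});

console.log(addUser({ data: {}, request: { headers: { 'x-username': 'jane' } } }).user);
console.log(addUser({ data: {} }).user);
```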