Any others seeing ODK Aggregate JSON publisher stalling (permanently) from time to time?

From time to time (once every month or two, but twice in the last two weeks) we see our ODK form publisher go into the PAUSED state.
It’s meant to retry automatically (every 15-30min I think) and recover, but we are finding that if a POST to OpenFn failed once it doesn’t ever recover and we have to manually delete and re-create the publisher to recover.
It’s always a pain to figure out how many records have been missed and re-publish them only (we sometimes have >1k submissions and don’t just want to throw the whole lot at OpenFn as that’s messy, even though it could handle it fine.

I’m guessing the problem lies in the ODK code, and I’ll still post on their mailing list (which is currently in transition for 2-3 days), but my question to the OpenFn team is this:

Do you often see ODK JSON publishers stalled on other projects that you are running/supporting?

The JSON publisher is still in Alpha, which makes we wonder if there may still be some bugs that need addressing. Although the retry mechanism applies to all the publishers (including those not in Alpha)

Thanks,
Andrew

errorlog3.png

errorlog4.png

errordetails1.txt (7.61 KB)

errordetails2.txt (7.5 KB)

UPDATE:

We’ve actually narrowed down the problem to:
a) Specific submissions containing the “é” character in some fields
b) Publishers pointing at OpenFn. If we point at our own endpoint then the publisher runs through fine (See attachment)

Having a look at our own endpoint shows that the “é” character is replaced with the replacement character (Hex: EF BF BD, Unicode: U+FFFD)
We’re guessing this happens in ODK’s publisher but are not sure - always difficult to trace exactly where encoding changes/problems occur.

So the revised question is: Would OpenFn’s inbox endpoint reject POSTs containing some form/encoding of the replacement character?

Just copying and pasting the character (�) and posting to OpenFn seems to work fine.

This is the text we think is causing our problem:

50 65 74 72 EF BF BD 20 4C 65 20 52 6F 75 78

Petr� Le Roux

Thanks,
Andrew

OpenFn publisher getting stuck.png

Last comment, it may well be related to something like this on the ODK side, but we’re still wondering if/why OpenFn is discarding (not responding to) the POSTs.

Hey Andrew,

I can’t get past what you’ve already confirmed, that Open Function responds with a 200 and an empty JSON object to all three examples you’ve given. (See attached.) I’m off to the ODK GitHub repo to see what the exact mechanism is for posting data.

More soon!

Taylor

strange_char_posts.png

Update here: https://github.com/opendatakit/opendatakit/issues/1283

Software and hardware versions

Aggregate v1.4.7 Production

Problem description

The ODK Aggregate JSON Publisher appears to use Latin-1 encoding, despite seeming to demand UTF-8. This causes failures at least when sending survey data containing the special character é.

65 72 5f 6e 61 6d 65 22 	e r _ n a m e " 
3a 22 54 e9 6c 6f 72 22 	: " T . l o r " 
7d 5d 7d                	} ] } 

Possible fix?

String Entity toString doesn’t have charset
I’ll submit a PR right now to show the proposed change, but I can’t run this on my local machine yet.

Steps to reproduce the problem1. Spin up Aggregate on Google AppSpot

  1. Set up a form with an active JSON Alpha Publisher that sends data to a server which will give you access to the binary. I use https://ngrok.com/ so inspect the requests.
  2. Submit data via ODK Collect which includes a special character. I’ve used “Télor” in this example.
  3. Inspect the POST coming from Aggregate and note that the é is encoded as e9 (binary in hex), rather than c3 a9. (Inspector image below.)
    image

Note that while it’s less obvious, you can tell the difference using Requestb.in. In the image below you’ll see that the second post (from ODK Aggregate) is Latin-1 encoded—we know this because it only contains 435 bytes, while the first post, made by me using UTF-8 encoding, is 436 bytes.
image

Expected behavior

Posts to JSON servers should use UTF-8 encoding.

45 6e 74 65 72 5f 6e 61 	E n t e r _ n a 
6d 65 22 3a 22 54 c3 a9 	m e " : " T . . 
6c 6f 72 22 7d 5d 7d    	l o r " } ] } 

Other information

I’m wondering if this is an issue with ODK Aggregate, or with Tomcat. It seems that you can set the request headers, but I’m not a Java developer and haven’t spun up ODK Aggregate on my machine.

These are what seem to be the relevant bits:

  1. Defining UTF8_ENCODE in HtmlConsts
  2. Create submissionJsonObj
  3. Send the post using HttpEntity
  4. String Entity toString doesn’t have charset—this might be the culprit

@acawood777 thanks for bringing this to my attention
@rorymckinley thanks for digging in with me

Submitted a PR to ODK Aggregate, and also looking into how we could (without getting really dirty) convert Latin-1 encoding to UTF-8 in our Phoenix plug.

Hi Taylor,

Fantastic work, thanks for tracking that down! Please keep us posted on
whether you're able to come up with a workaround on your end.
We'd even be happy with a replacement character being populated instead of
the é initially, as long as OpenFn responds with a 200 so the ODK publisher
doesn't get stuck on that submission.

We've updated our ODK form so that there is no longer an é-character in one
of the dropdown lists, but it's only a matter of time before someone uses a
character like that in a text field somewhere!

Thanks again,
Andrew