qwerty
October 26, 2025, 8:30am
1
Hello,
I installed only the Lightning source on a local server ( GitHub - OpenFn/lightning: Lightning ⚡️ is the latest version of the OpenFn platform, a DPG and DPI building block that governments use to manage complex service/workflow automation and data integration projects. ). But I can't run simple (or empty) workflows. I attached screenshots. Do I need to install anything else?
version: v2.14.13-pre1
Status: starting → (enqueued →) lost. System logs:
lightning_worker_1 | [SRV] ❯ Connected to worker queue socket
lightning_worker_1 | [SRV] Connected to Lightning at ws://web:4000/worker
lightning_worker_1 | [SRV] Starting workloop
lightning_worker_1 | [SRV] ❯ requesting run (capacity 0/5)
lightning_worker_1 | [SRV] ❯ claimed 0 runs in 16ms (-)
lightning_worker_1 | [SRV] ❯ requesting run (capacity 0/5)
lightning_worker_1 | [SRV] ❯ claimed 0 runs in 17ms (-)
lightning_worker_1 | [SRV] ❯ requesting run (capacity 0/5)
lightning_worker_1 | [SRV] ❯ claimed 0 runs in 13ms (-)
lightning_worker_1 | [SRV] ❯ requesting run (capacity 0/5)
lightning_worker_1 | [SRV] ❯ claimed 0 runs in 14ms (-)
lightning_worker_1 | [SRV] ❯ requesting run (capacity 0/5)
lightning_worker_1 | [SRV] ❯ claimed 0 runs in 14ms (-)
lightning_worker_1 | [SRV] ❯ requesting run (capacity 0/5)
lightning_worker_1 | [SRV] ❯ claimed 0 runs in 19ms (-)
lightning_worker_1 | [SRV] ❯ requesting run (capacity 0/5)
lightning_worker_1 | [SRV] ❯ claimed 0 runs in 10ms (-)
lightning_worker_1 | [SRV] ❯ requesting run (capacity 0/5)
lightning_worker_1 | [SRV] ❯ claimed 0 runs in 5ms (-)
lightning_worker_1 | [SRV] ❯ requesting run (capacity 0/5)
lightning_worker_1 | [SRV] ❯ claimed 0 runs in 7ms (-)
lightning_worker_1 | [SRV] ❯ requesting run (capacity 0/5)
lightning_worker_1 | [SRV] ❯ claimed 0 runs in 5ms (-)
lightning_worker_1 | [SRV] ❯ requesting run (capacity 0/5)
lightning_worker_1 | [SRV] ❯ claimed 0 runs in 8ms (-)
lightning_worker_1 | [SRV] ❯ requesting run (capacity 0/5)
lightning_worker_1 | [SRV] ❯ claimed 0 runs in 7ms (-)
joe
October 27, 2025, 7:43am
2
Hi @qwerty , thanks for reaching out!
I can see your logs but not your screenshots.
You don’t need to install anything else; your logs look absolutely fine. Can you search or grep for “claimed 1 runs” anywhere in the logs?
Can you tell me more about your local setup? How are you starting Lightning? How are you triggering runs?
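For example, something along these lines (assuming the compose service is called worker, as in a typical setup):

```shell
# Count successful claims in the worker's logs.
# "claimed 1 runs" (or higher) means the worker actually picked up a run;
# only ever seeing "claimed 0 runs" means nothing was handed over.
docker compose logs worker | grep -c "claimed 1 runs"
```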
qwerty
October 27, 2025, 8:22am
4
Hi friend, I just attached the pics. Please check them.
OS: Ubuntu Server 24 LTS.
The worker can’t claim any runs; all attempts end up LOST.
docker-compose.yml:

x-lightning: &default-app
  build:
    dockerfile: Dockerfile
    context: '.'
    args:
      - 'MIX_ENV=prod'
      - 'NODE_ENV=production'
  depends_on:
    - 'postgres'
  restart: 'unless-stopped'
  stop_grace_period: '3s'

services:
  postgres:
    image: 'postgres:15.12-alpine'
    restart: 'unless-stopped'
    deploy:
      resources:
        limits:
          cpus: '${DOCKER_POSTGRES_CPUS:-0}'
          memory: '${DOCKER_POSTGRES_MEMORY:-0}'
    environment:
      POSTGRES_USER: ${POSTGRES_USER}
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
      POSTGRES_DB: ${POSTGRES_DB}
    stop_grace_period: '3s'
    volumes:
      - 'postgres:/var/lib/postgresql/data'

  web:
    <<: *default-app
    deploy:
      resources:
        limits:
          cpus: '${DOCKER_WEB_CPUS:-0}'
          memory: '${DOCKER_WEB_MEMORY:-0}'
    dns:
      - 8.8.8.8
      - 8.8.4.4
    environment:
      DATABASE_URL: ${DATABASE_URL}
      WORKER_SECRET: ${WORKER_SECRET}
      SECRET_KEY_BASE: ${SECRET_KEY_BASE}
      WORKER_RUNS_PRIVATE_KEY: ${WORKER_RUNS_PRIVATE_KEY}
      PRIMARY_ENCRYPTION_KEY: ${PRIMARY_ENCRYPTION_KEY}
      PHX_HOST: ${PHX_HOST}
      PHX_PROTO: ${PHX_PROTO}
      SERVER_URL: ${SERVER_URL}
      URL_SCHEME: ${URL_SCHEME}
      PORT: ${PORT}
      HOST: ${HOST}
      MIX_ENV: prod
      NODE_ENV: production
      ERLANG_NODE_DISCOVERY_VIA_POSTGRES_ENABLED: ${ERLANG_NODE_DISCOVERY_VIA_POSTGRES_ENABLED}
    depends_on:
      - postgres
    healthcheck:
      test: '${DOCKER_WEB_HEALTHCHECK_TEST:-curl localhost:4000/health_check}'
      interval: '10s'
      timeout: '3s'
      start_period: '5s'
      retries: 3
    ports:
      - '0.0.0.0:4000:4000'
    volumes:
      - ./repo:/tmp/openfn/worker/repo

  worker:
    image: 'openfn/ws-worker:latest'
    user: '0:0'
    restart: always
    # deploy:
    #   resources:
    #     limits:
    #       cpus: '${DOCKER_WORKER_CPUS:-0}'
    #       memory: '${DOCKER_WEB_MEMORY:-0}'
    depends_on:
      - web
    dns:
      - 8.8.8.8
      - 8.8.4.4
    environment:
      DATABASE_URL: ${DATABASE_URL}
      WORKER_SECRET: ${WORKER_SECRET}
      PRIMARY_ENCRYPTION_KEY: ${PRIMARY_ENCRYPTION_KEY}
      MIX_ENV: prod
      NODE_ENV: production
      NODE_OPTIONS: '--dns-result-order=ipv4first'
    command: ['pnpm', 'start:prod', '-l', 'ws://web:${PORT}/worker']
    stop_grace_period: '3s'
    expose:
      - '2222'

volumes:
  postgres: {}
joe
October 27, 2025, 10:30am
5
Are you able to isolate and attach a complete log of just the worker’s output? A complete log of everything would also help.
It takes an hour for a run to be marked as Lost (we do that when the worker times out, basically), but I don’t need an hour of logs. Just a minute or so around the triggering of a run.
qwerty
October 27, 2025, 10:51am
6
The job is simple:

name: workflow1
jobs:
  Transform-data:
    name: Transform data
    adaptor: "@openfn/language-common@latest"
    body: |-
      // Check out the Job Writing Guide for help getting started:
      // Job Writing Guide | OpenFn/docs
      fn(state => {
        console.log("Test job running..123!");
        return { success: true, timestamp: new Date().toISOString() };
      });
triggers:
  webhook:
    type: webhook
    enabled: true
edges:
  webhook->Transform-data:
    condition_type: always
    enabled: true
    target_job: Transform-data
    source_trigger: webhook

After several minutes it got LOST; it never started.
Worker log:
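For reference, webhook triggers in Lightning are fired with a plain HTTP POST; something like the following (the trigger UUID is a placeholder, copy the real webhook URL from the workflow editor):

```shell
# Placeholder URL; use the webhook URL shown in the Lightning UI
curl -X POST http://localhost:4000/i/<trigger-uuid> \
  -H 'Content-Type: application/json' -d '{}'
```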
~/lightning# docker compose logs -f worker
worker-1 |
worker-1 | > @openfn/ws-worker@1.18.0 start:prod /app/packages/ws-worker
worker-1 | > node dist/start.js -l ws://web:4000/worker
worker-1 |
worker-1 | [SRV] Starting worker server…
worker-1 | [SRV] ❯ Creating runtime engine…
worker-1 | [SRV] ❯ Engine options: {
worker-1 | “memoryLimitMb”: 500,
worker-1 | “maxWorkers”: 5,
worker-1 | “statePropsToRemove”: [
worker-1 | “configuration”,
worker-1 | “response”
worker-1 | ],
worker-1 | “runTimeoutMs”: 300000,
worker-1 | “workerValidationTimeout”: 5000,
worker-1 | “workerValidationRetries”: 3
worker-1 | }
worker-1 | [RTE] Using default repo directory: /tmp/openfn/worker/repo
worker-1 | [RTE] repoDir set to /tmp/openfn/worker/repo
worker-1 | [RTE] memory limit set to 500mb
worker-1 | [RTE] statePropsToRemove set to: [
worker-1 | “configuration”,
worker-1 | “response”
worker-1 | ]
worker-1 | [RTE] ❯ Loading workers from /app/packages/engine-multi/dist/worker/thread/run.js
worker-1 | [RTE] ❯ pool: Creating new child process pool | capacity: 5
worker-1 | [RTE] ❯ pool: Created new child process 25
worker-1 | [RTE] ❯ pool: finished task in worker 25
worker-1 | [RTE] ❯ Engine worker validated in 1508ms
worker-1 | [SRV] ❯ Engine created!
worker-1 | [SRV] ❯ Creating worker instance
worker-1 | [SRV] WARNING: deprecated socketTimeoutSeconds value passed.
worker-1 |
worker-1 | This will be respected as the default socket timeout value, but will be removed from future versions of the worker.
worker-1 | [SRV] ❯ Worker options: {
worker-1 | “port”: 2222,
worker-1 | “lightning”: “ws://web:4000/worker”,
worker-1 | “sentryEnv”: “dev”,
worker-1 | “noLoop”: false,
worker-1 | “backoff”: {
worker-1 | “min”: 1000,
worker-1 | “max”: 10000
worker-1 | },
worker-1 | “maxWorkflows”: 5,
worker-1 | “payloadLimitMb”: 10,
worker-1 | “messageTimeoutSeconds”: 30,
worker-1 | “claimTimeoutSeconds”: 3600
worker-1 | }
worker-1 | [SRV] Worker cute-toes-fall listening on 2222
worker-1 | [SRV] WARNING: no collections URL provided. Collections service will not be enabled.
worker-1 | [SRV] Pass --collections-url or set WORKER_COLLECTIONS_URL to set the url
worker-1 | [SRV] Worker started OK
worker-1 | [SRV] ❯ Connecting to Lightning at ws://web:4000/worker
worker-1 | [SRV] ❯ Reporting connection error to sentry
worker-1 | [SRV] ✘ CRITICAL ERROR: could not connect to lightning at ws://web:4000/worker
worker-1 | [SRV] ❯ connect ECONNREFUSED 172.21.0.3:4000
worker-1 | [SRV] ❯ queue socket closed
worker-1 | [SRV] Connection to lightning lost
worker-1 | [SRV] Worker will automatically reconnect when lightning is back online
worker-1 | [SRV] ❯ Connected to worker queue socket
worker-1 | [SRV] Connected to Lightning at ws://web:4000/worker
worker-1 | [SRV] Starting workloop
worker-1 | [SRV] ❯ requesting run (capacity 0/5)
worker-1 | [SRV] ❯ claimed 0 runs in 57ms (-)
worker-1 | [SRV] ❯ requesting run (capacity 0/5)
worker-1 | [SRV] ❯ claimed 0 runs in 21ms (-)
worker-1 | [SRV] ❯ requesting run (capacity 0/5)
worker-1 | [SRV] ❯ claimed 0 runs in 6ms (-)
worker-1 | [SRV] ❯ requesting run (capacity 0/5)
worker-1 | [SRV] ❯ claimed 0 runs in 22ms (-)
worker-1 | [SRV] ❯ requesting run (capacity 0/5)
worker-1 | [SRV] ❯ claimed 0 runs in 22ms (-)
worker-1 | [SRV] ❯ requesting run (capacity 0/5)
worker-1 | [SRV] ❯ claimed 0 runs in 19ms (-)
worker-1 | [SRV] ❯ requesting run (capacity 0/5)
worker-1 | [SRV] ❯ claimed 0 runs in 23ms (-)
worker-1 | [SRV] ❯ requesting run (capacity 0/5)
worker-1 | [SRV] ❯ claimed 0 runs in 21ms (-)
worker-1 | [SRV] ❯ requesting run (capacity 0/5)
worker-1 | [SRV] ❯ claimed 0 runs in 10ms (-)
worker-1 | [SRV] ❯ requesting run (capacity 0/5)
worker-1 | [SRV] ❯ claimed 0 runs in 6ms (-)
worker-1 | [SRV] ❯ requesting run (capacity 0/5)
worker-1 | [SRV] ❯ claimed 0 runs in 7ms (-)
worker-1 | [SRV] ❯ requesting run (capacity 0/5)
worker-1 | [SRV] ❯ claimed 0 runs in 6ms (-)
worker-1 | [SRV] ❯ requesting run (capacity 0/5)
worker-1 | [SRV] ❯ requesting run (capacity 0/5)
worker-1 | [SRV] ❯ Connected to worker queue socket
worker-1 | [SRV] Connected to Lightning at ws://web:4000/worker
worker-1 | [SRV] ❯ claimed 0 runs in 828ms (-)
worker-1 | [SRV] ❯ requesting run (capacity 0/5)
worker-1 | [SRV] ❯ claimed 0 runs in 8ms (-)
worker-1 | [SRV] ❯ requesting run (capacity 0/5)
worker-1 | [SRV] ❯ claimed 0 runs in 14ms (-)
worker-1 | [SRV] ❯ requesting run (capacity 0/5)
worker-1 | [SRV] ❯ claimed 0 runs in 7ms (-)
worker-1 | [SRV] ❯ requesting run (capacity 0/5)
worker-1 | [SRV] ❯ claimed 0 runs in 8ms (-)
joe
October 27, 2025, 12:30pm
7
I’ve never run the app out of docker myself, but this looks weird to me:
worker-1 | [SRV] ❯ requesting run (capacity 0/5)
worker-1 | [SRV] ❯ requesting run (capacity 0/5)
worker-1 | [SRV] ❯ Connected to worker queue socket
worker-1 | [SRV] Connected to Lightning at ws://web:4000/worker
worker-1 | [SRV] ❯ claimed 0 runs in 828ms (-)
Every “requesting run” log should be followed by a “claimed N runs” log. But here, it looks like one of the requests failed - presumably while downloading a run - and the connection to Lightning dropped.
I can’t think why that would happen - it’s not something I’ve seen before.
Are you able to isolate the corresponding Lightning logs for me? We should see the same request/claim cycle, and I wonder if there’s any error or warning from the Lightning side
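To capture that, something along these lines should work (service names taken from the compose file earlier in the thread; the --since flag is supported by recent docker compose versions):

```shell
# Grab the last couple of minutes from both sides right after triggering a run
docker compose logs --since 2m web worker > around-run.log
```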
qwerty
October 28, 2025, 10:42am
8
Hello, I attached some Lightning logs from after a failed run, plus the .env file. Please check them.
I edited ./config/runtime.exs before building Lightning:
~/lightning/config# cat runtime.exs

Lightning.Config.Bootstrap.source_envs()
Lightning.Config.Bootstrap.configure()

import Config

env = System.get_env("MIX_ENV") || "prod"

if env == "prod" do
  host = System.get_env("HOST") || "66.42.57.64"
  port = String.to_integer(System.get_env("PORT") || "4000")

  config :lightning, LightningWeb.Endpoint,
    http: [ip: {0, 0, 0, 0}, port: port],
    url: [host: host, port: port, scheme: "http"],
    server: true,
    check_origin: ["http://#{host}:#{port}"]

  # Worker socket
  # config :lightning, Lightning.Runtime.RuntimeManager,
  #   start: true

  # Worker secret
  config :lightning,
    worker_secret: System.get_env("WORKER_SECRET")
end
joe
October 28, 2025, 12:11pm
9
I think this may be because WORKER_RUNS_PRIVATE_KEY is unset (or set to the wrong value).
In your runtime.exs, can you ensure that the private_key is set? Take a look at config/dev.exs for an example.
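For illustration only, a sketch of the kind of block that would set it; the exact application key is an assumption here, so copy the real key name from config/dev.exs:

```elixir
# In config/runtime.exs -- sketch only; check config/dev.exs for the
# actual application key Lightning reads the runs private key from.
config :lightning, Lightning.Runs,
  private_key: System.get_env("WORKER_RUNS_PRIVATE_KEY")
```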
qwerty
October 28, 2025, 1:00pm
10
The keys are stored in the .env file. I attached the env_file file.
Here:
WORKER_RUNS_PRIVATE_KEY=937ac25f760759bf51576ee889f8f16b371bdea04495d58fa521935a32456bce
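One quick check worth running here (service name assumed from the compose file above): confirm the variable actually reaches the web container, since a value in .env only helps if the compose file passes it through:

```shell
# Should print the key; empty output means the env var never
# reached the container, regardless of what .env says
docker compose exec web printenv WORKER_RUNS_PRIVATE_KEY
```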
qwerty
October 29, 2025, 4:54am
11
I just set private_key in runtime.exs. It’s working, thank you friend!
joe
October 29, 2025, 8:05am
13
That’s great news @qwerty !
We’re investigating the issue and trying to work out why there’s no good, clean error in this case. It shouldn’t have been so hard to diagnose!