Applyr Auto-Apply Engine
Multi-platform job auto-apply engine with AI-powered form filling, anti-bot detection, pipeline tracking, and session management
The Jobs page (/admin/jobs) manages Applyr, an automated job application engine that handles multi-platform submissions (LinkedIn Easy Apply and Indeed Apply), AI-powered form filling, anti-bot detection hardening, pipeline tracking, and follow-up scheduling. The engine runs as an OpenRC-managed service on callisto (10.42.0.100) and communicates with the Arcturus-Prime admin UI via a secure API proxy.
Architecture
Arcturus-Prime Admin UI (/admin/jobs)
|
| HTTPS via Cloudflare Pages proxy
| Auth: Cloudflare Access + API key injection
v
API Proxy (/api/jobs/[...path].ts)
|
| HTTP to 10.42.0.100:8585
| X-Api-Key injected server-side
v
Applyr FastAPI Service (port 8585)
|
├── Patchright (undetected Chromium)
├── Claude Haiku (cover letters, screening answers)
├── SQLite (application tracking)
└── Evidence capture (screenshots + HTML)
The API proxy at /api/jobs/[...path].ts handles authentication via Cloudflare Access, injects the X-Api-Key header server-side (so the key never reaches the browser), and forwards all requests to the Applyr service. SSE streams for real-time status updates are passed through without timeout.
Supported Platforms
| Platform | Status | Apply Method | Anti-Detection |
|---|---|---|---|
| LinkedIn | Working | Easy Apply modal | Standard (Patchright) |
| Indeed | Hardened | Indeed Apply (iframe) | Full anti-bot stack |
Anti-Bot Detection (Indeed)
Indeed uses Cloudflare WAF + Turnstile for aggressive bot detection. The engine implements nine layers of protection:
| Layer | Protection |
|---|---|
| Browser | Patchright — patches CDP Runtime.enable leak at source level (the #1 detection vector) |
| Fingerprint | Real Chrome channel with native UA/sec-ch-ua headers (no spoofing — avoids Cloudflare platform mismatch detection), randomized viewport from 5 common resolutions, --disable-blink-features=AutomationControlled |
| Headers | Accept-Language: en-US,en;q=0.9 — all other headers use Chrome’s real values to avoid fingerprint mismatch |
| Behavior | Bezier curve mouse movement, per-character typing with random pauses and rare typo+correction, variable-speed scrolling |
| Session | Warmup browsing on Indeed homepage before applying, cookie persistence |
| CAPTCHA | Auto-detection of 9 Cloudflare challenge indicators, 120-second manual solve window in headful mode |
| Login | Google SSO detection + passwordless verification code flow with 300-second timeout, 60-second manual fallback |
| Resilience | 57 fallback CSS selectors across 13 element types, retry with exponential backoff (3 attempts) |
| Rate Limiting | 3–10 second extra delay between Indeed page loads |
CAPTCHA Handling
When Indeed presents a Cloudflare challenge:
- The engine detects it automatically (checks for Turnstile iframe, challenge wrapper, page title changes)
- In headful mode (default), the engine pauses and waits up to 120 seconds
- The user solves the CAPTCHA manually in the visible browser window on callisto
- The engine detects resolution and continues the apply flow
- Failed applications retry up to 3 times with exponential backoff
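The retry behaviour above can be sketched as follows. This is a minimal illustration, not the engine's code: the helper name `retry_with_backoff`, the injectable `sleep` parameter, and the doubling schedule (base × 2^attempt) are assumptions; only the 3-attempt limit and the 5-second base come from the configuration table.

```python
import asyncio


async def retry_with_backoff(apply_once, max_retries=3, base_seconds=5.0,
                             sleep=asyncio.sleep):
    """Call apply_once() up to max_retries times, waiting longer after each failure.

    Hypothetical helper: the doubling schedule is an assumed form of
    "exponential backoff"; the real engine may use a different curve.
    """
    last_error = None
    for attempt in range(max_retries):
        try:
            return await apply_once()
        except Exception as exc:  # a real engine would catch narrower errors
            last_error = exc
            if attempt < max_retries - 1:
                # Wait base, 2*base, 4*base, ... before the next attempt.
                await sleep(base_seconds * 2 ** attempt)
    raise last_error
```

Passing `sleep` as a parameter keeps the sketch testable without real delays.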
Passwordless Login (Indeed)
Indeed uses passwordless authentication for Gmail accounts. The login flow handles three authentication paths:
| Path | Detection | Handling |
|---|---|---|
| Traditional password | input[type="password"] visible | Fill password, submit, check for challenge |
| Google SSO | “Continue with Google” / “Sign in with a code instead” text | Click “Sign in with a code instead” → verification code flow |
| Direct verification code | “Enter the code” / input[inputmode="numeric"] | Wait for manual code entry |
Flow:
Enter email → Submit → Challenge check
↓
Password field? → Traditional login
Google SSO page? → Click "Sign in with a code" → Wait for verification code
Verification code page? → Wait for manual code entry (300s timeout)
None detected? → Wait 5s, re-check → 60s manual fallback
The engine detects Google SSO with 3 selectors, verification code pages with 18 selectors (text patterns + input attributes), and uses 4 fallback selectors for the Continue button. In headful mode, the user enters the verification code from their email in the visible browser window.
Browser Fingerprint Strategy
The engine uses the real Chrome binary (chrome channel in Patchright) with no user-agent or header spoofing. Previous versions spoofed Windows UA headers on Linux, which Cloudflare Turnstile detected as a platform mismatch (error 600010). The current approach:
- Real Chrome channel with native UA and sec-ch-ua headers
- --disable-blink-features=AutomationControlled to suppress automation flags
- Patchright patches CDP Runtime.enable leak at source level
- No user_agent, sec-ch-ua, sec-ch-ua-platform, or sec-ch-ua-mobile overrides
- Randomized viewport and locale/timezone set to match physical location
Human Behavior Simulation
All interactions use human-like behavior rather than Playwright’s instant .fill() and .click() methods:
- Mouse movement: Bezier curves with random control points and offset jitter, not instant teleportation
- Typing: Per-character input with variable delays (50–150ms), occasional longer pauses (300–800ms), and rare typo-then-backspace sequences
- Clicking: Mouse moves to element first, waits 100–300ms, then clicks with random offset within element bounds
- Scrolling: Variable-speed smooth scrolling with random pixel amounts, not instant scrollTo()
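As an illustration of the mouse-movement bullet, the sketch below generates points along a cubic Bezier curve with randomly placed control points. The function name `bezier_mouse_path`, the step count, and the jitter range are assumptions for the example; the engine's actual curve parameters are not documented here.

```python
import random


def bezier_mouse_path(start, end, steps=25, jitter=80.0, rng=None):
    """Return (x, y) points along a cubic Bezier curve from start to end."""
    rng = rng or random.Random()
    (x0, y0), (x3, y3) = start, end

    def control_point(fraction):
        # A point part-way along the straight line, pushed off it by random jitter.
        return (
            x0 + (x3 - x0) * fraction + rng.uniform(-jitter, jitter),
            y0 + (y3 - y0) * fraction + rng.uniform(-jitter, jitter),
        )

    (x1, y1), (x2, y2) = control_point(1 / 3), control_point(2 / 3)
    path = []
    for i in range(steps + 1):
        t = i / steps
        u = 1 - t
        # Standard cubic Bezier formula evaluated at parameter t.
        x = u**3 * x0 + 3 * u**2 * t * x1 + 3 * u * t**2 * x2 + t**3 * x3
        y = u**3 * y0 + 3 * u**2 * t * y1 + 3 * u * t**2 * y2 + t**3 * y3
        path.append((x, y))
    return path
```

Each point could then be passed to the browser's mouse-move call with a short random sleep in between, so the cursor traces the curve instead of jumping to the target.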
LinkedIn Form Handling
LinkedIn Easy Apply uses a multi-step modal with different HTML structures for different question types:
Standard Screening Questions
LinkedIn’s built-in questions (work authorization, visa sponsorship, years of experience, etc.) use fieldset[data-test-form-builder-radio-button-form-component] with a dedicated title element. These are detected and answered via keyword matching against the screening KB.
Employer Custom Questions
Employer-specific questions (“Do you agree to our salary range?”, “Please confirm you won’t use AI tools”, etc.) appear on the “Answer these questions from the employer” step. These use plain input[type="radio"] elements without the data-test-form-builder attributes.
The engine handles these with a separate bare radio handler that:
- Extracts the question text by trying <legend>, long <span> elements (>10 chars), and non-radio <label> elements, which avoids grabbing “Yes”/“No” option labels as the question
- Matches agreement/consent patterns: questions containing “agree”, “acknowledge”, “consent”, “confirm”, or “indicate yes” automatically return “Yes”
- Falls back to AI if no pattern matches
- Auto-selects first option as last resort (logged for KB expansion)
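The decision order in the list above might look roughly like this. It is a sketch under assumptions: `answer_bare_radio` and the `ai_fallback` callable are hypothetical names, and plain substring matching is assumed for the agreement patterns.

```python
AGREEMENT_PATTERNS = ("agree", "acknowledge", "consent", "confirm", "indicate yes")


def answer_bare_radio(question, options, ai_fallback=None):
    """Pick an option for an employer radio question; return (answer, source)."""
    text = question.lower()
    # 1. Agreement/consent wording answers "Yes" when that option exists.
    if any(pattern in text for pattern in AGREEMENT_PATTERNS) and "Yes" in options:
        return "Yes", "pattern"
    # 2. Otherwise ask the AI (hypothetical callable) and accept only a listed option.
    if ai_fallback is not None:
        answer = ai_fallback(question, options)
        if answer in options:
            return answer, "ai"
    # 3. Last resort: first option; the caller should log this for KB expansion.
    return options[0], "first_option"
```

Returning the source alongside the answer makes it easy to log which questions fell through to the last resort.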
Field Detection Hierarchy
For each [data-test-form-element] on the current modal step:
| Priority | Type | Selector | Handler |
|---|---|---|---|
| 1 | Select (dropdown) | select | _handle_select() |
| 2 | Radio (standard) | fieldset[data-test-form-builder-radio-button-form-component] | _handle_radio() |
| 3 | Radio (employer) | input[type="radio"] (bare) | _handle_bare_radio() |
| 4 | Text input | input[type="text"] | _handle_text_input() |
| 5 | Number input | input[type="number"] | _handle_text_input() |
| 6 | Textarea | textarea | _handle_textarea() |
| 7 | Checkbox | input[type="checkbox"] | _handle_checkbox() |
AI Content
Resume Tailoring
The engine generates ATS-optimized resumes tailored for each specific job using Claude Haiku. The prompt strategy:
- Cherry-picks the 4 most relevant positions from the master resume
- Mirrors exact keywords and terminology from the job posting
- Uses the formula: [Action Verb] + [What You Did] + [Quantified Impact] for each bullet
- Lists 15-25 skills ordered by job posting relevance
- Includes both acronyms and full names (e.g., “Amazon Web Services (AWS)”)
- Generates 8-12 additional ATS keywords matching the candidate’s background
A supplementary data/candidate_context.md file provides background context (current situation, personal projects, gap framing) that the AI uses to position the candidate effectively.
AI Content Viewer
The admin UI provides full visibility into what the AI generates for each application. In the History and Queue tabs, each application card has a sparkle wand button that opens a detail modal with three tabs:
- Resume — Professional summary, skill tags, work experience entries with rewritten bullets, and additional ATS keywords
- Cover Letter — Full generated cover letter
- Answers — Screening question/answer pairs from the application form
This allows reviewing AI output before approving queued applications and auditing what was sent for submitted applications.
Session Management
Starting a Session
From the admin UI, click Start to open the session configuration modal:
- Profile Selection: choose a candidate profile from data/profiles/. The dropdown shows the candidate name with a subtitle showing current title, location, years of experience, and the number of screening answer categories. Profiles contain identity info, screening answers, career context, and resume references. The selected profile’s screening answers override the default KB during the session.
- CSV Selection: choose a scored CSV from data/scored_jobs/. The dropdown shows filename, job count, file size, and last modified date. CSVs come from JobSpyAdvanced-Lab and must include columns: title, company, location, job_url, site, total_score, description.
- Score Threshold: minimum total_score for auto-submission (default: 150). Jobs below this threshold are queued for manual review instead of auto-submitted.
- Max Applications: maximum applications per session (default: 25).
- Platform Selection: checkboxes for LinkedIn and Indeed. Both are checked by default. The engine filters jobs by the site column in the CSV to match selected platforms.
Session Flow
Load CSV → Filter by threshold + platform
↓
For each job:
1. Navigate to job URL
2. Detect apply method (Easy Apply / External ATS)
3. Fill application form (AI-powered answers)
4. Capture evidence (screenshot + HTML)
5. Submit or queue for review
6. Rate-limit delay (Indeed: 3-10s extra)
↓
Session complete → summary stats emitted via SSE
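The first step of the flow (load the CSV, then filter by threshold and platform) can be illustrated as below. The function `partition_jobs`, the sample rows, and the lowercase values in the site column are assumptions for the example; the column names and the default threshold of 150 come from the session settings.

```python
import csv
import io


def partition_jobs(rows, threshold=150.0, platforms=("linkedin", "indeed")):
    """Split CSV rows into (auto_submit, review_queue) for the selected platforms."""
    auto_submit, review_queue = [], []
    for row in rows:
        if row["site"].strip().lower() not in platforms:
            continue  # platform not selected for this session
        if float(row["total_score"]) >= threshold:
            auto_submit.append(row)
        else:
            review_queue.append(row)  # below threshold: manual review
    return auto_submit, review_queue


# Made-up sample data using the required column names.
SAMPLE_CSV = """title,company,location,job_url,site,total_score,description
SRE,Acme,Remote,https://example.com/1,linkedin,180,Keep systems up
DevOps Engineer,Globex,Remote,https://example.com/2,indeed,120,Build pipelines
Platform Engineer,Initech,Remote,https://example.com/3,glassdoor,200,Run platforms
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
auto_submit, review_queue = partition_jobs(rows)
```

With the sample data, the LinkedIn job scores above the threshold, the Indeed job lands in the review queue, and the third row is dropped because its platform is not selected.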
Stopping a Session
Click the Stop button (replaces Start when running). The engine finishes the current application, then stops. The progress bar shows X / Y applications processed.
Vision-Driven Form Filling
When enabled (APPLYR_VISION_ENABLED=true), the engine uses Claude’s vision API to understand each form step visually rather than relying solely on CSS selector pattern matching.
How It Works
On each form step:
- Screenshot the modal/form area
- Send the screenshot + candidate profile + job context to Claude’s vision API
- Receive structured JSON: field type, label, answer, confidence for every visible field
- Fill each field using Playwright, matching vision labels to DOM elements via fuzzy matching
- Fall back to legacy selector-based logic for any field where vision confidence is below the threshold
Why Vision
The legacy approach uses hardcoded CSS selectors to discover fields and extract labels. This breaks when:
- Employers use custom question formats without standard data-test attributes
- Radio button labels are in unexpected DOM locations
- Form layouts change between platform updates
- Questions are ambiguous without visual context
Vision sees the form exactly as a human would, and uses the candidate’s full profile to choose correct answers.
Configuration
| Key | Default | Description |
|---|---|---|
| APPLYR_VISION_ENABLED | false | Enable vision form filling (env var) |
| APPLYR_VISION_MODEL | claude-haiku-4-5-20251001 | Vision model (env var) |
| vision_confidence_threshold | 0.5 | Below this, fall back to legacy selectors |
Fallback Chain
For each form field, the system tries:
- Vision answer (confidence >= threshold) — fill using AI’s visual interpretation
- Legacy selector — fall back to the original CSS-based field detection + pattern matching + AI text fallback
- If the entire vision API call fails, the full step falls back to legacy automatically
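A compact way to picture the per-field fallback is shown here. The shape of the vision result (dicts with label, answer, and confidence keys) and the names `build_fill_plan` and `legacy_lookup` are assumptions; only the 0.5 default threshold is taken from the configuration table.

```python
def build_fill_plan(vision_fields, legacy_lookup, threshold=0.5):
    """Decide, field by field, whether to trust vision or fall back to legacy logic.

    vision_fields: list of dicts with "label", "answer", "confidence", or None
    when the vision API call failed for the whole step (assumed convention).
    legacy_lookup: hypothetical callable mapping a label to a selector-based answer.
    """
    if vision_fields is None:
        return None  # caller runs the legacy handler for the entire step
    plan = []
    for field in vision_fields:
        if field["confidence"] >= threshold:
            plan.append((field["label"], field["answer"], "vision"))
        else:
            plan.append((field["label"], legacy_lookup(field["label"]), "legacy"))
    return plan
```

Recording the source per field matches the source badges shown in the audit view, though the exact badge names used there may differ.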
Watchdog & Self-Recovery
The engine has a multi-layer resilience system that prevents stuck states and automatically recovers when form fills hang.
Per-Job Timeout
Each job application is wrapped in asyncio.wait_for() with a configurable timeout (default: 180 seconds). If a single job exceeds this limit, the attempt is cancelled, the browser state is recovered, and the engine continues to the next job. The timeout SSE event includes the job title and company.
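A minimal sketch of that wrapper, assuming hypothetical coroutine functions `apply_job` and `recover` supplied by the caller:

```python
import asyncio


async def run_job_with_timeout(apply_job, recover, timeout=180.0):
    """Run one application attempt; on timeout, recover and report it."""
    try:
        # wait_for cancels the inner coroutine once the timeout elapses.
        return await asyncio.wait_for(apply_job(), timeout=timeout)
    except asyncio.TimeoutError:
        await recover()  # e.g. dismiss modals, navigate to a neutral page
        return "timeout"
```

Because `asyncio.wait_for()` cancels the wrapped task, the apply code has to tolerate cancellation at any await point, which is one reason the recovery step resets the browser state afterwards.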
Watchdog Task
A background asyncio task monitors heartbeat timestamps every 10 seconds. The apply chain emits heartbeats at key checkpoints:
| Checkpoint | When |
|---|---|
| detect_method | After determining Easy Apply vs External |
| generating_resume / resume_generated | Before/after AI resume tailoring |
| generating_cover_letter / cover_letter_generated | Before/after AI cover letter |
| starting_form_fill | Before entering the form stepper |
| linkedin_form_step_N | Each LinkedIn Easy Apply form step |
| indeed_step_N | Each Indeed Apply form step |
| evidence_captured | After screenshot/HTML evidence saved |
If no heartbeat is received for 90 seconds (configurable), the watchdog emits a watchdog_alert SSE event — an early warning before the hard timeout fires.
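The stall check itself reduces to comparing the last heartbeat time against the stall timeout. The class below is an illustrative sketch; `HeartbeatMonitor` and its injectable `clock` are assumed names, not the engine's API.

```python
import time


class HeartbeatMonitor:
    """Track the most recent heartbeat and report when it is overdue."""

    def __init__(self, stall_timeout=90.0, clock=time.monotonic):
        self._stall_timeout = stall_timeout
        self._clock = clock
        self.last_checkpoint = None
        self._last_beat = clock()

    def beat(self, checkpoint):
        # Called by the apply chain at each checkpoint (e.g. "starting_form_fill").
        self.last_checkpoint = checkpoint
        self._last_beat = self._clock()

    def is_stalled(self):
        # The watchdog task would call this on its polling interval.
        return self._clock() - self._last_beat > self._stall_timeout
```

A background task polling `is_stalled()` every 10 seconds could emit the alert event and include `last_checkpoint` to show where the job stopped making progress.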
Circuit Breaker
Tracks consecutive failures. After 5 consecutive failures (configurable), the engine pauses for 120 seconds and emits a circuit_breaker SSE event. This prevents burning through the job list when something systemic is wrong (login expired, network issue, platform changes).
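The counting logic can be sketched as follows. The class name and the choice to reset the counter after tripping are assumptions; the defaults of 5 failures and 120 seconds match the config keys.

```python
class CircuitBreaker:
    """Count consecutive failures and signal when the session should pause."""

    def __init__(self, threshold=5, pause_seconds=120):
        self.threshold = threshold
        self.pause_seconds = pause_seconds
        self.consecutive_failures = 0

    def record(self, success):
        """Record one job result; return how many seconds to pause (0 for none)."""
        if success:
            self.consecutive_failures = 0
            return 0
        self.consecutive_failures += 1
        if self.consecutive_failures >= self.threshold:
            # Assumed behaviour: start counting afresh after the pause.
            self.consecutive_failures = 0
            return self.pause_seconds
        return 0
```

Any success resets the streak, so isolated failures spread across a session never trip the breaker.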
Recovery
On timeout, the engine:
- Presses Escape to dismiss any open dialogs
- Calls dismiss_modal() to close LinkedIn/Indeed modals
- Navigates to a neutral page (LinkedIn feed or Indeed homepage)
- If navigation fails, recreates the browser page entirely
- Waits 3 seconds for the page to settle, then continues to the next job
Stop Signal
The stop signal (ctx.check_stop()) is checked before each expensive operation and inside form loops. This makes stop responsive within 2-3 seconds, even mid-form-fill.
Config Keys
| Key | Default | Description |
|---|---|---|
| per_job_timeout | 180 | Max seconds per single job application |
| watchdog_stall_timeout | 90 | Seconds of no heartbeat before SSE alert |
| circuit_breaker_threshold | 5 | Consecutive failures before pausing |
| circuit_breaker_pause | 120 | Seconds to pause after circuit breaker trips |
Application Pipeline
The pipeline tracks applications through stages with a visual funnel:
| Stage | Description | Transition |
|---|---|---|
| Pending | Loaded from CSV, not yet processed | Auto → Submitted/Queued/Failed |
| Queued | Below threshold, awaiting manual review | Approve → Submitted, Reject → Skipped |
| Submitted | Application sent successfully | Manual → Screening/Interview/Offer/Rejected/Ghosted |
| Screening | Recruiter/phone screen stage | Manual → Interview/Rejected |
| Interview | Technical, onsite, panel rounds | Manual → Offer/Rejected |
| Offer | Received an offer | Terminal |
| Rejected | Got a rejection | Terminal |
| Ghosted | No response after follow-up window | Auto (30 days) or manual |
External ATS Handling
Jobs that don’t have Easy Apply get marked as external_opened — the engine navigates to the ATS URL but cannot fill external forms. These show in the queue with a “Confirm Submitted” button for manual confirmation after the user applies on the external site.
Review Queue
Jobs below the score threshold land in the review queue. Each queued application shows:
- Job title and company
- Score (highlighted if near threshold)
- Job URL (external link)
- Approve button — submits the application
- Skip button — marks as skipped
Follow-Up Tracking
The follow-up panel shows applications past their follow-up date without a status change. A banner at the top of the page shows the count of due follow-ups. Each entry shows the company, position, follow-up date, current status, and notes.
Auto-Ghost
The Mark Ghosted button in the header batch-marks all submitted applications older than 30 days with no status change as “ghosted”. This keeps the pipeline clean and the response rate metric accurate.
Stats Dashboard
The stats row shows aggregate counts across all application history:
- Total — all applications ever tracked
- Submitted — successfully sent
- Screening — in recruiter/phone screen
- Interview — in interview rounds
- Offers — received offers
- Rejected — got rejections
- Ghosted — no response after 30 days
- Response % — (screening + interview + offer + rejected) / submitted
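The response-rate formula in the last bullet works out as in this sketch. The function name and the lowercase dictionary keys are assumptions, and it assumes the submitted count includes every application that was sent, even those that later moved to another stage.

```python
def response_rate(counts):
    """Percentage of submitted applications that received any response."""
    submitted = counts.get("submitted", 0)
    if submitted == 0:
        return 0.0  # avoid dividing by zero before anything has been sent
    responded = sum(
        counts.get(stage, 0)
        for stage in ("screening", "interview", "offer", "rejected")
    )
    return 100.0 * responded / submitted
```

For example, 40 submitted applications with 4 in screening, 3 in interview, 1 offer, and 8 rejections give 16 responses, or 40%. Ghosted applications count against the rate, which is why marking them promptly keeps the metric meaningful.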
Service Management
OpenRC Service
Applyr runs as an OpenRC service on callisto with supervise-daemon for automatic restart:
- Auto-start: enabled at the default runlevel, so the service starts on boot
- Auto-restart: supervise-daemon respawns the process on crash (non-zero exit) with a 3-second delay
- Respawn limits: max 10 restarts within 60 seconds before giving up
GUI Controls
- Restart button: the header shows a “Restart” button when the service is online. It triggers a restart via the service-restart endpoint (exits with code 1 so supervise-daemon respawns).
- Offline auto-retry: when the health check fails, the UI shows a pulsing reconnect indicator with a 5-second countdown. It auto-retries and reloads the page when the service comes back.
- Uptime display: the subtitle shows “Engine idle · Up Xh Ym” when no session is running.
Service Commands
sudo rc-service applyr start
sudo rc-service applyr stop
sudo rc-service applyr restart
sudo rc-service applyr status
API Endpoints
All endpoints are proxied through /api/jobs/* with Cloudflare Access auth and server-side API key injection.
Session Control
| Method | Path | Description |
|---|---|---|
| POST | /api/jobs/start | Start apply session (body: csv_path, threshold, max_applications, platforms, profile) |
| POST | /api/jobs/stop | Stop current session |
| POST | /api/jobs/panic-stop | Emergency browser kill (force-closes Patchright) |
| GET | /api/jobs/status | Session status (running, progress, current job) |
| GET | /api/jobs/status/stream | SSE event stream for real-time updates |
Application Management
| Method | Path | Description |
|---|---|---|
| GET | /api/jobs/queue | Review queue (below-threshold applications) |
| POST | /api/jobs/queue/{id}/approve | Approve queued application |
| POST | /api/jobs/queue/{id}/reject | Reject queued application |
| GET | /api/jobs/history | Application history (limit param) |
| POST | /api/jobs/{id}/status | Transition application status (query: status, notes, follow_up_date) |
| POST | /api/jobs/{id}/confirm-external | Confirm external ATS submission |
| GET | /api/jobs/{id}/ai-content | Tailored resume, cover letter, screening answers |
| GET | /api/jobs/{id}/evidence | Get screenshot/HTML evidence paths |
| GET | /api/jobs/{id}/full-audit | Combined: DB record + audit trail + AI content + evidence |
| GET | /api/jobs/evidence-file/{id}/{filename} | Serve evidence screenshot/HTML file (binary) |
Pipeline & Stats
| Method | Path | Description |
|---|---|---|
| GET | /api/jobs/stats | Aggregate application statistics |
| GET | /api/jobs/pipeline | Pipeline funnel with response rate |
| GET | /api/jobs/follow-ups | Applications needing follow-up |
| POST | /api/jobs/mark-ghosted | Auto-ghost 30-day stale applications |
Profiles
| Method | Path | Description |
|---|---|---|
| GET | /api/jobs/profiles | List candidate profiles (slug, name, title, location, experience, answer count) |
| GET | /api/jobs/profiles/{slug} | Get full profile detail (excludes screening_answers for size) |
Data & Config
| Method | Path | Description |
|---|---|---|
| GET | /api/jobs/scored-csvs | Available scored CSV files |
| GET | /api/jobs/platforms | Available platforms and status |
| GET | /api/jobs/logs | Engine log buffer (since/limit params) |
| GET | /api/jobs/session-logs | Session log file list |
| GET | /api/jobs/session-logs/{name} | Read specific session log |
| GET | /api/jobs/config/paths | Current file path configuration |
| PUT | /api/jobs/config/paths | Update file paths at runtime |
Service Management
| Method | Path | Description |
|---|---|---|
| GET | /api/jobs/health | Health check (no auth required) |
| GET | /api/jobs/service-status | PID, uptime, memory usage |
| POST | /api/jobs/service-restart | Restart service (supervise-daemon respawn) |
Configuration
Key settings in applyr/config.py:
| Key | Default | Description |
|---|---|---|
| headless | False | Keep false for CAPTCHA solving |
| indeed_warmup_enabled | True | Browse Indeed before applying |
| indeed_challenge_wait_timeout | 120 | Seconds to wait for CAPTCHA solve |
| indeed_verification_code_timeout | 300 | Seconds to wait for email verification code |
| indeed_max_retries | 3 | Retry attempts per application |
| indeed_page_load_delay | (3, 10) | Extra seconds between page loads |
| indeed_retry_backoff_base | 5 | Exponential backoff base (seconds) |
| auto_submit_threshold | 150 | Score threshold for auto-submission |
Environment Variables
Applyr Service (.env)
| Variable | Required | Description |
|---|---|---|
| ANTHROPIC_API_KEY | Yes | Claude API key for AI form filling |
| INDEED_EMAIL | For Indeed | Indeed login email |
| INDEED_PASSWORD | Optional | Indeed password (not needed for Gmail accounts, which use the passwordless verification code flow) |
| LINKEDIN_EMAIL | For LinkedIn | LinkedIn login email |
| LINKEDIN_PASSWORD | For LinkedIn | LinkedIn login password |
| APPLYR_API_KEY | Yes | Shared API key (must match Arcturus-Prime) |
| RESUME_PATH | Yes | Path to .docx resume |
| RESUME_PDF_PATH | Yes | Path to .pdf resume |
| APPLYR_DATA_DIR | No | Data directory (default: ./data) |
Arcturus-Prime (.env)
| Variable | Required | Description |
|---|---|---|
| AUTOAPPLY_API_KEY | Yes | Must match APPLYR_API_KEY |
| JOBS_API_URL | Yes | Applyr service URL (e.g., http://10.42.0.100:8585) |
Application Intelligence Center
The audit page (/admin/applyr-audit) provides a comprehensive view of every detail Applyr sends to employers. It replaces the basic audit viewer with a full intelligence interface for reviewing applications, preparing for interviews, and troubleshooting issues.
Stats Bar
The top bar shows aggregate counts across all audited applications: Total, Submitted, Queued, External, Failed, Skipped. Each counter is color-coded to match the status pill colors.
Search & Filtering
A toolbar below the stats bar provides:
- Text search — filters by job title or company name (instant, client-side)
- Status filter — dropdown for submitted/queued/external/failed/skipped
- Platform filter — LinkedIn or Indeed
- Sort — newest first, oldest first, highest score, company A-Z
List View
Each application card shows the job title, company, platform, score, apply method, and time ago. Completeness indicators show whether the application has a cover letter, screening answers, and audit trail data. Error badges appear when the application has an error message.
Detail View — 9 Tabs
Clicking an application loads its full data via the combined full-audit endpoint (single API call) and presents it across nine tabs:
| Tab | Content |
|---|---|
| Employer Brief | Summary card with what was sent: key answers, selections made, cover letter excerpt. “Copy Briefing” button formats a plain-text summary for interview prep. |
| Timeline | Chronological event log from application start through form steps, field fills, and final result. Each event shows a timestamp and context. |
| Fields | All field interactions grouped by form step in collapsible sections. For select/radio fields: shows “Chose X from [A, B, C, D]” format. Source badges: default (green), AI (purple), vision (cyan), fallback (amber), prefilled (gray). |
| Answers | Q&A pairs merged from two sources: audit trail field interactions (text/textarea types) and the database answers_json column. Deduplicated. “Copy All” button. |
| Cover Letter | Full text with character count, generation timestamp, and copy button. |
| Resume | Which resume was used (master vs tailored), filename. If a tailored resume exists, shows summary and skills. |
| Evidence | Inline screenshots from each form step (served via the evidence-file endpoint). Click to expand in lightbox overlay. HTML snapshots open in new tab. |
| Errors | Error cards with message, context, step number, and timestamp. Shows a green “no errors” message when clean. |
| Raw | Full JSON audit data with copy button. |
Evidence File Serving
Screenshots and HTML snapshots are stored in data/evidence/YYYY-MM-DD/audit_{company}_{timestamp}/ directories. The API serves them via GET /api/jobs/evidence-file/{app_id}/{filename} with path traversal protection. The Arcturus-Prime proxy passes binary responses through using arrayBuffer() instead of text() to avoid corrupting image data.
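Path traversal protection of this kind is commonly done by resolving the requested path and checking that it stays inside the evidence directory. The sketch below shows that general technique and is not the service's actual handler; `resolve_evidence_file` is a hypothetical name.

```python
from pathlib import Path


def resolve_evidence_file(evidence_dir, filename):
    """Return the absolute path for filename, refusing anything outside evidence_dir."""
    base = Path(evidence_dir).resolve()
    # resolve() collapses ".." segments and follows symlinks before the check.
    candidate = (base / filename).resolve()
    if not candidate.is_relative_to(base):  # requires Python 3.9+
        raise ValueError("requested file is outside the evidence directory")
    return candidate
```

A request for `../../etc/passwd` resolves to a path outside the base directory and is rejected, as is an absolute filename, because joining an absolute path replaces the base entirely.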
Real-Time Logs
The Logs tab shows engine activity in near real time by polling /api/jobs/logs. Log entries include:
- Session start/stop events
- Per-job progress updates (job title, company, score)
- Apply results (submitted, failed, queued, skipped)
- Error details with stack context
- CAPTCHA detection and resolution events
- Warmup activity
The log buffer holds the last 500 entries in memory. Session logs are also persisted to disk at data/session_logs/.
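A fixed-size in-memory buffer like this is often built on a bounded deque. The sketch below assumes a hypothetical `LogBuffer` class and treats the since parameter as a sequence number; the real endpoint may use timestamps instead.

```python
from collections import deque


class LogBuffer:
    """Keep only the most recent log entries in memory."""

    def __init__(self, capacity=500):
        # A deque with maxlen silently discards the oldest entry when full.
        self._entries = deque(maxlen=capacity)
        self._next_seq = 0

    def append(self, message):
        self._entries.append((self._next_seq, message))
        self._next_seq += 1

    def since(self, seq=0, limit=100):
        """Return up to limit entries whose sequence number is at least seq."""
        return [entry for entry in self._entries if entry[0] >= seq][:limit]

    def __len__(self):
        return len(self._entries)
```

A polling client can remember the last sequence number it saw and request only newer entries on the next poll.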