Why Job Posting Data Matters
Companies telegraph their strategic priorities through job postings 6–12 months before they show up in financial results. A company posting 80 AI engineering roles is pivoting to AI. One posting 80 sales roles is pushing revenue. One filing WARN Act notices is cutting headcount before earnings. These are all publicly available signals that most investors never systematically track.
Three Data Layers
Layer 1: Job Postings (Indeed/Greenhouse/Lever)
- Track total open roles per company weekly
- Break down by department: Engineering, Sales, Marketing, Operations
- Flag 20%+ increases in any category week-over-week
- Greenhouse and Lever have public job board JSON APIs (no key required for public boards)
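The 20% week-over-week flag above reduces to a small pure function. This is a sketch; the function name and the `{department: count}` data shape are my own choices, not part of any API:

```python
def flag_department_changes(
    last_week: dict, this_week: dict, threshold_pct: float = 20.0
) -> dict:
    """Return {department: pct_change} for departments whose open-role
    count moved by at least threshold_pct week-over-week."""
    flags = {}
    for dept in set(last_week) | set(this_week):
        before = last_week.get(dept, 0)
        after = this_week.get(dept, 0)
        if before == 0:
            # A department that went from zero openings to some is always notable
            if after > 0:
                flags[dept] = float("inf")
            continue
        change = (after - before) / before * 100
        if abs(change) >= threshold_pct:
            flags[dept] = round(change, 1)
    return flags
```

For example, `flag_department_changes({"Engineering": 50, "Sales": 10}, {"Engineering": 65, "Sales": 10})` flags Engineering at +30% and ignores the flat Sales count.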
Layer 2: WARN Act Filings (Free, Public)
- The Worker Adjustment and Retraining Notification Act requires employers with 100+ employees to give 60 days' notice of mass layoffs and plant closings
- Filed with state labor departments, most are publicly accessible
- Signals: layoffs before they're announced, geographic concentration of cuts
Layer 3: LinkedIn (Signal Only, No Scraping)
- LinkedIn's Terms of Service prohibit scraping. Use LinkedIn as a manual validation layer only.
- The hiQ Labs v. LinkedIn case established some protection for accessing public data, but ToS violations remain a legal risk.
HEARTBEAT Configuration
```yaml
name: job_intel_monitor
schedule: "0 9 * * 1"  # every Monday at 09:00
steps:
  - fetch_jobs:
      companies:
        - name: "Stripe"
          greenhouse_board: "stripe"
        - name: "Notion"
          greenhouse_board: "notion"
      departments:
        - Engineering
        - Sales
        - Marketing
        - Operations
  - compare:
      to: last_week
      flag_if_change_pct_exceeds: 20
  - fetch_warn:
      states: ["CA", "NY", "TX", "WA"]
      since: "{{ 30_days_ago }}"
  - llm:
      prompt: |
        Analyze these job posting trends and WARN filings.
        Identify: companies hiring aggressively in specific departments,
        companies showing headcount contraction, any WARN filings for watchlist companies.
        Data: {{ job_data }} {{ warn_data }}
  - notify:
      subject: "💼 Weekly Hiring Intel — {{ date }}"
```
Greenhouse API Implementation
Fetch Public Job Board
```python
from collections import Counter

import httpx


def get_greenhouse_jobs(board_token: str) -> list:
    """Fetch all open jobs from a Greenhouse job board (public, no auth required)."""
    url = f"https://boards-api.greenhouse.io/v1/boards/{board_token}/jobs"
    # content=true asks the Job Board API to include department/office data per job
    r = httpx.get(
        url,
        params={"content": "true"},
        headers={"User-Agent": "AltDataBot contact@youremail.com"},
    )
    r.raise_for_status()
    jobs = r.json().get("jobs", [])
    return [{
        "id": j["id"],
        "title": j["title"],
        # departments may be missing or an empty list; fall back to "Unknown"
        "department": (j.get("departments") or [{}])[0].get("name") or "Unknown",
        "location": j.get("location", {}).get("name", ""),
        "updated_at": j.get("updated_at", ""),
    } for j in jobs]


def count_by_department(jobs: list) -> dict:
    """Count open roles per department."""
    return dict(Counter(j["department"] for j in jobs))
```
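The `compare: to: last_week` step needs history to compare against. One minimal way to persist weekly snapshots is a JSON file keyed by board and ISO date; the file name and layout here are my own choices, not part of the HEARTBEAT runtime:

```python
import json
from datetime import date
from pathlib import Path


def save_snapshot(board_token: str, dept_counts: dict, path: str = "snapshots.json") -> None:
    """Append this week's department counts under the board and today's date."""
    p = Path(path)
    history = json.loads(p.read_text()) if p.exists() else {}
    history.setdefault(board_token, {})[date.today().isoformat()] = dept_counts
    p.write_text(json.dumps(history, indent=2))


def load_last_snapshot(board_token: str, path: str = "snapshots.json") -> dict:
    """Return the most recently saved counts for a board, or {} if none exist."""
    p = Path(path)
    if not p.exists():
        return {}
    weeks = json.loads(p.read_text()).get(board_token, {})
    return weeks[max(weeks)] if weeks else {}
```

ISO dates sort lexicographically, so `max(weeks)` is always the latest snapshot without any date parsing.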
WARN Act Data Fetching
```python
import io

import httpx
import pandas as pd


def fetch_california_warn(since_days: int = 30) -> list:
    """
    California WARN notices are published at edd.ca.gov.
    This fetches the public XLSX report.
    Note: check edd.ca.gov for the current URL — they update it periodically.
    """
    url = "https://www.edd.ca.gov/jobs_and_training/warn/WARN-Report.xlsx"
    r = httpx.get(url, follow_redirects=True)
    r.raise_for_status()
    df = pd.read_excel(io.BytesIO(r.content))
    df["Notice Date"] = pd.to_datetime(df["Notice Date"], errors="coerce")
    cutoff = pd.Timestamp.now() - pd.Timedelta(days=since_days)
    return df[df["Notice Date"] > cutoff].to_dict("records")
```
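To connect WARN records back to a watchlist, a case-insensitive substring match on the company column is usually enough. The default `company_field="Company"` is an assumption — verify the column name against the report you actually download:

```python
def match_watchlist(records: list, watchlist: list, company_field: str = "Company") -> list:
    """Return WARN records whose company name contains any watchlist name."""
    names = [w.lower() for w in watchlist]
    return [
        rec for rec in records
        if any(name in str(rec.get(company_field, "")).lower() for name in names)
    ]
```

Substring matching tolerates suffixes like "Inc." or "LLC" in filings, at the cost of occasional false positives for very short names.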
Frequently Asked Questions
Can I scrape LinkedIn for job data?
No. LinkedIn's ToS prohibits scraping. Use their official APIs or manual validation only.
How do I find a company's Greenhouse board?
Visit boards.greenhouse.io/{company-name} — many companies use Greenhouse for their public job boards.
How reliable are WARN filings as a layoff signal?
By law, companies must file 60 days before a mass layoff, but some companies pay fines rather than comply. Treat filings as a signal, not a guarantee.
Next Steps
Now that you can track hiring momentum and layoff signals, move to Part 3: Reddit & News Sentiment to monitor how social conversations and news tone shift in response to these hiring and layoff events.