AI-Verified Economic Analytics Pipeline
Macro Economic Dashboard
This project is a full analytics pipeline: API ingestion, relational modeling, statistical adjustment, machine learning, NLP, AI narrative generation with programmatic fact-checking, and an interactive agent that lets visitors query the database in plain English. It runs on FRED economic data and Hacker News labor-market discussions, and it ships as a Streamlit dashboard backed by a SQLite star schema.
I built it because my prior work at Ringer Sciences covered NLP and ML but had no public SQL project and no public dashboard. Most analytics roles ask for both, so this is the piece that fills that gap.
Article Source Code
Code for the project can be found at https://github.com/ShameekConyers/sql_python_dashboard.
Dashboard
The thesis
The information sector has been losing jobs per capita since 2022. Specialty trades keep growing. Power output is climbing. The yield curve sat inverted for over two years. I wanted to know whether these signals line up or whether I was reading patterns into noise.
I pulled 10 FRED series and 1,500+ Hacker News stories to examine it from two angles: the macro numbers (recession indicators, employment divergence, inflation) and the text (what tech workers are talking about, and whether their sentiment tracks the employment data).
The dashboard has five sections. The first is a conversational interface where visitors ask questions and get verified answers from the database; the rest cover the macro overview, recession risk modeling, NLP topic analysis, and a deep dive into AI's labor-market effects. Behind them sit 21 AI-generated narrative insights, each fact-checked against the source database, a recession probability model, an NLP topic model over HN stories, and 573 tests backing the whole thing.
Data sources
FRED was the natural choice. I have an econometrics degree, it's the standard source for U.S. macro data, and the API is free with generous rate limits. The 10 series I picked map cleanly into a star schema, so I could focus on analysis rather than wrestling with pagination across multiple endpoints.
Hacker News was a pivot. I originally planned to use Reddit, but Reddit tightened API access in late 2025 and new script-type apps no longer receive working tokens. The workarounds (browser-session extraction, reverse-engineered tokens) would break the project's "works on clone" requirement, so I dropped it.
HN's Algolia search API has no auth requirement and has been stable for over a decade. The tech-practitioner audience carries the same labor-sentiment signal I was after. I pulled stories matching layoff, AI jobs, and career themes from January 2022 onward, scored them with a RoBERTa sentiment model, and grouped them into topics with NMF.
Findings
Per-capita normalization turned out to be the most consequential methodological choice in the project.
Raw info-sector employment looks flat. After dividing by working-age population, it fell 7.2%. Specialty trades grew 13.5% on the same basis. That's a 20+ point divergence that's invisible without the normalization step.
The yield curve inverted in 26 of the months tracked. Every U.S. recession since the 1970s was preceded by an inversion. Inflation hit 37% cumulative over the dataset window. Headline unemployment sits near 4.4%, but U6 (which includes discouraged and involuntary part-time workers) runs 3.3 points higher. That gap has stayed wide since 2020.
Electric power output is up 8.5% since ChatGPT launched in November 2022.
On the NLP side, "Software Engineering Careers" is the dominant HN topic at 585 stories. The most negative topic by sentiment is "Executive Firings & Restructuring." Layoff story volume and the U6-U3 unemployment gap move together across the 2022-2026 window. That co-movement is suggestive, not causal, but I thought it was worth surfacing.
Claim verification
This is what I'd lead with in a technical interview. The core problem: LLMs get numbers wrong. In early iterations where the model computed values itself, about 15% of claims failed verification. The fix was straightforward. Stop asking it to do math.
Step 1: Python pre-computes the claims. For each of the 21 analytical slices, a script queries the database and builds 2-4 verifiable claims. "USINFO changed -3.2% between 2025-04 and 2026-03." "The yield curve was inverted in 26 months." These are real database query results. The model never touches them.
Step 2: The LLM writes prose only. The pre-computed claims and their data context go to llama3.1:8b through Ollama, running locally. The model's job is to write readable paragraphs around the numbers. It doesn't compute anything.
Step 3: Independent verification. A separate script re-queries the database for every claim and compares expected vs actual. Tolerances are generous (5% relative or 0.5 absolute for values, sign-match for trends) because the goal is catching hallucinations, not flagging rounding differences.
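The comparison step is small enough to sketch. This is illustrative rather than the repo's actual verify_insights.py (the claim-dict shape and function name are my assumptions), but the tolerances mirror the ones above:

```python
import math
import sqlite3

REL_TOL = 0.05  # 5% relative tolerance for value claims
ABS_TOL = 0.5   # 0.5 absolute tolerance for value claims

def verify_claim(conn: sqlite3.Connection, claim: dict) -> dict:
    """Re-query the database for one claim and compare expected vs actual."""
    actual = conn.execute(claim["sql"]).fetchone()[0]
    expected = claim["expected"]
    if claim["kind"] == "trend":
        # Trend claims only need the direction to match (rising vs falling).
        passed = (actual > 0) == (expected > 0)
    else:
        # math.isclose passes within 5% relative OR 0.5 absolute tolerance.
        passed = math.isclose(actual, expected, rel_tol=REL_TOL, abs_tol=ABS_TOL)
    return {"text": claim["text"], "expected": expected, "actual": actual, "passed": passed}
```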
The dashboard shows a badge on each insight block: green if all claims pass, orange if some fail (with "X of Y confirmed"), red if none pass. A "Show sources" panel inside each block shows every claim, the expected value, the actual value, and whether they matched.
21 insights ship in the seed database. All 21 pass verification. The demo works without Ollama installed.
Ask the Data
The 21 batch insights answer the questions I chose. Ask the Data lets visitors ask their own.
It's a LangGraph ReAct agent with two tools: a read-only SQL tool that queries the database directly, and a RAG retrieval tool that pulls context from the same vector store used for batch insight generation. The agent decides which tool to call (or both), produces an answer, and then a post-processing step runs the same verification pipeline described above. The green/orange/red badges and the "Show sources" panel work identically whether the insight was pre-computed or generated live.
The SQL tool is sandboxed. A regex filter rejects mutations (INSERT, UPDATE, DELETE, DROP), the connection opens in read-only mode, and results are capped at 100 rows. The agent gets up to 2 SQL round-trips per question before it has to synthesize.
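The sandbox is simple enough to show in full. A sketch under assumptions (the names are mine, not the repo's); the key points are the regex gate, SQLite's read-only URI mode as a second line of defense, and the row cap:

```python
import re
import sqlite3

FORBIDDEN = re.compile(r"\b(INSERT|UPDATE|DELETE|DROP)\b", re.IGNORECASE)
MAX_ROWS = 100

def run_readonly_sql(db_path: str, sql: str) -> list[tuple]:
    """Run a query against a read-only connection, capped at 100 rows."""
    if FORBIDDEN.search(sql):
        raise ValueError("mutating statements are rejected")
    # mode=ro makes SQLite itself refuse writes, even if a mutation slips past the regex
    conn = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)
    try:
        return conn.execute(sql).fetchmany(MAX_ROWS)
    finally:
        conn.close()
```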
On Streamlit Cloud, the agent runs against Anthropic's API (Claude Haiku). Locally it defaults to Ollama. The cloud deployment has an access key gate: visitors enter a key validated with hmac.compare_digest, and the session unlocks for 2 minutes before requiring re-entry. Keys live in Streamlit secrets, never in the repo.
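The gate itself is a few lines of Streamlit. A sketch with assumed names (the secret key name and helper function are illustrative):

```python
import hmac
import time
import streamlit as st

UNLOCK_SECONDS = 120  # session stays unlocked for 2 minutes

def unlocked() -> bool:
    """Return True if the visitor has entered a valid access key recently."""
    if time.time() - st.session_state.get("unlocked_at", 0.0) < UNLOCK_SECONDS:
        return True
    entered = st.text_input("Access key", type="password")
    # Constant-time comparison; avoids leaking key prefixes through timing.
    if entered and hmac.compare_digest(entered, st.secrets["AGENT_ACCESS_KEY"]):
        st.session_state["unlocked_at"] = time.time()
        return True
    return False
```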
I placed Ask the Data at the top of the dashboard. It's the first thing visitors see. The pre-computed insights are still there throughout, but the interactive agent is the stronger demonstration of how the pipeline components connect.
COVID adjustment
COVID broke every rolling-window calculation in the dataset. Unemployment went from 3.5% to 14.8% in a single month. A 12-month YoY window touching April 2020 produces swings of +300% and -58% that dominate the charts for two full years.
For each series, I fit an ARIMA model on pre-COVID data and used the forecast as a counterfactual for March 2020 through January 2022, with a 3-month taper blending back to actual values. The raw data stays in the value column. The adjusted version goes in value_covid_adjusted. Every query uses the adjusted column except the COVID recovery chart, which shows the real shock intentionally.
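In sketch form (my function, not the repo's covid_adjustment.py verbatim; assumes a monthly DatetimeIndex and pmdarima):

```python
import numpy as np
import pandas as pd
import pmdarima as pm

def covid_adjust(series: pd.Series, start="2020-03", end="2022-01", taper=3) -> pd.Series:
    """Replace the COVID window with an ARIMA counterfactual, tapering back to actuals."""
    pre = series.loc[:start].iloc[:-1]              # fit on pre-COVID months only
    model = pm.auto_arima(pre, suppress_warnings=True)
    window = series.loc[start:end]
    forecast = pd.Series(model.predict(n_periods=len(window)), index=window.index)
    adjusted = series.copy()
    adjusted.loc[window.index] = forecast
    # Linear taper over the last months: forecast weight fades to zero.
    tail = window.index[-taper:]
    w = np.linspace(1.0, 0.0, taper + 1)[1:]
    adjusted.loc[tail] = w * forecast.loc[tail].to_numpy() + (1 - w) * series.loc[tail].to_numpy()
    return adjusted
```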
The adjustment didn't change any conclusions. It made the charts readable and the rolling calculations meaningful.
Per-capita normalization
Raw employment numbers grow partly because the U.S. working-age population grows about 0.5% per year. Comparing specialty trades employment of 4,256k in 2016 to 5,244k in 2026 overstates the real expansion because some of that growth is just more people.
USINFO and CES2023800001 are divided by CNP16OV (civilian noninstitutional population 16+) to get employees per 1,000 working-age persons, then indexed to 100 at the start date.
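The transformation itself is two lines of pandas. A minimal sketch, assuming both series share a monthly index:

```python
import pandas as pd

def per_capita_index(employment: pd.Series, population: pd.Series) -> pd.Series:
    """Employees per 1,000 working-age persons, indexed to 100 at the start date."""
    per_1k = employment / population * 1_000
    return per_1k / per_1k.iloc[0] * 100

# e.g. per_capita_index(usinfo, cnp16ov) for the information sector
```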
Before normalization, the information sector shows index 101. After: 93. That's the difference between "the sector barely moved" and "it shrank 7.2% relative to population." I spent more time on this one methodology step than on any individual chart, and it's the reason the project's central finding exists.
Recession model
The recession model is a logistic regression and a random forest trained on 11 FRED-derived features (yield spread, unemployment change, GDP growth, CPI momentum, employment ratios) plus 3 Hacker News features (rolling sentiment, story volume, layoff frequency). It outputs a monthly recession probability between 0 and 1, stored in the recession_predictions table.
The HN features have near-zero importance in the shipped model. The pre-2022 training period has no HN data, so those months get filled with training-period medians. That constant fill dilutes whatever signal exists in the 24 post-2022 months. I kept them in rather than dropping weak features, because removing inputs to improve apparent performance felt like overfitting to what the model can't see yet.
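A sketch of the training setup (hyperparameters and function names are my assumptions; the feature matrix X is monthly, with NaNs in the HN columns before 2022):

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

def fit_recession_models(X: pd.DataFrame, y: pd.Series):
    """y is a 0/1 recession flag; HN columns get median-filled where empty."""
    X = X.fillna(X.median())  # the constant fill that dilutes the HN signal
    logit = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)).fit(X, y)
    forest = RandomForestClassifier(n_estimators=300, random_state=0).fit(X, y)
    return logit, forest

# Monthly probability: forest.predict_proba(X)[:, 1]
```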
The Recession Risk tab shows a probability timeline, a feature snapshot with directional signals, and a What If scenario explorer where visitors can adjust sliders and see how the risk score responds.
NLP topic modeling
I ran sklearn's NMF (Non-negative Matrix Factorization) over 1,547 HN story titles and excerpts to extract 8 topics. I tested values from 6 to 10 and chose 8 because it produced the most distinct clusters without fragmenting related themes.
One issue I had to address: non-AI proper nouns (Musk, Twitter, Tesla, Meta, Facebook) were creating personality-driven topics rather than labor-theme topics. I added those as stop words. AI companies and figures (OpenAI, Altman) stayed in the vocabulary because they're part of the thesis.
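In sketch form (the vectorizer settings are assumptions; the stop-word handling is the part that matters):

```python
from sklearn.decomposition import NMF
from sklearn.feature_extraction.text import TfidfVectorizer, ENGLISH_STOP_WORDS

# Non-AI proper nouns that were spawning personality-driven topics.
EXTRA_STOP_WORDS = {"musk", "twitter", "tesla", "meta", "facebook"}

def fit_topics(docs: list[str], k: int = 8):
    """TF-IDF + NMF over story titles and excerpts; returns the fitted models."""
    vectorizer = TfidfVectorizer(stop_words=list(ENGLISH_STOP_WORDS | EXTRA_STOP_WORDS))
    tfidf = vectorizer.fit_transform(docs)
    nmf = NMF(n_components=k, random_state=0)  # k=8 chosen after testing 6-10
    doc_topics = nmf.fit_transform(tfidf)      # story -> topic weights
    return vectorizer, nmf, doc_topics
```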
The NLP Analysis section has four charts:
- Topic distribution over time (stacked area, shows how the conversation shifted)
- Sentiment by topic (box plot, which themes carry the most negative tone)
- Layoff story volume vs the U6-U3 gap (dual-axis, tests whether HN activity tracks macro slack)
- Topic sentiment vs USINFO per-capita employment (dual-axis, tests whether sentiment tracks actual jobs)
Monthly bigram frequencies are pre-computed and shown as a quarterly heatmap in an expander.
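The bigram counts come from a straightforward aggregation; a sketch assuming a stories frame with month and title columns (not the repo's actual code):

```python
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer

def monthly_bigrams(stories: pd.DataFrame, top_n: int = 20) -> pd.DataFrame:
    """Top bigrams per month from story titles."""
    rows = []
    for month, group in stories.groupby("month"):
        vec = CountVectorizer(ngram_range=(2, 2), stop_words="english")
        counts = vec.fit_transform(group["title"]).sum(axis=0).A1
        terms = vec.get_feature_names_out()
        for bigram, count in sorted(zip(terms, counts), key=lambda t: -t[1])[:top_n]:
            rows.append({"month": month, "bigram": bigram, "count": int(count)})
    return pd.DataFrame(rows)
```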
RAG citations
Each AI insight pulls context from a vector store before generation. FRED series metadata and curated U.S. federal publications (BEA, EIA, CEA reports, all public domain) are chunked at sentence boundaries, embedded with sentence-transformers, and stored in ChromaDB. The top-k chunks for each analytical slice get injected into the prompt as reference context.
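A sketch of the embed-and-retrieve path (the embedding model name, collection name, and storage path are assumptions, not the repo's values):

```python
import chromadb
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")
client = chromadb.PersistentClient(path="chroma_store")
collection = client.get_or_create_collection("reference_docs")

def add_chunks(chunks: list[str], source: str) -> None:
    """Embed sentence-boundary chunks and store them with their source document."""
    collection.add(
        ids=[f"{source}-{i}" for i in range(len(chunks))],
        documents=chunks,
        embeddings=embedder.encode(chunks).tolist(),
        metadatas=[{"source": source}] * len(chunks),
    )

def retrieve(question: str, k: int = 5) -> list[str]:
    """Top-k chunks to inject into the prompt as reference context."""
    hits = collection.query(query_embeddings=embedder.encode([question]).tolist(), n_results=k)
    return hits["documents"][0]
```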
The LLM is instructed to cite these with [ref:N] tags. It doesn't. llama3.1:8b ignores the citation instruction consistently. The sources displayed in the "Show sources" panel all come from an auto-attach mechanism that surfaces retrieved chunks regardless of whether the model cited them. It works for the reader, but it's a workaround. The intended design was model-driven citation, and that part didn't work with this model.
Architecture
FRED API + HN Algolia API
|
data_pull.py + hackernews_pull.py
|
sentiment_score.py
|
db_setup.py -> covid_adjustment.py -> topic_model.py
|
export_csv.py -> embed_references.py -> recession_model.py
|
ai_insights.py -> verify_insights.py
|
dashboard/app.py <-> agent/ (LangGraph ReAct: SQL tool + RAG tool + verification)
SQLite with a star schema. Main tables:
| Table | Role |
|---|---|
| series_metadata | Display names, categories, units for each FRED series |
| observations | Raw values and ARIMA-adjusted values side by side |
| ai_insights | Narratives, pre-computed claims, verification results, RAG citations |
| recession_predictions | Monthly probability scores, feature snapshots, model metadata |
| hn_stories | 1,547 HN stories with sentiment scores and topic assignments |
| hn_topics | 8 NMF topics with labels and top terms |
| hn_ngram_monthly | 520 monthly bigram frequency rows |
| reference_docs | FRED metadata, scholarly docs, and HN social refs for RAG |
Two modes: seed (default, everything pre-computed, works on clone with no API calls) and full (live pull, requires a free FRED API key).
Analysis queries
Eight SQL queries using CTEs, window functions, joins, and per-capita normalization:
| Query | Question |
|---|---|
| Q1 | Yield curve inversions vs unemployment (T10Y2Y monthly avg + UNRATE with 12-month lag) |
| Q2 | Info vs trades divergence, per-capita normalized, indexed to 100 |
| Q3 | GDP annualized growth with NBER recession shading |
| Q4 | Rolling 12-month per-capita employment growth by sector |
| Q5 | COVID recovery comparison (raw values, the one exception to adjusted data) |
| Q6 | U6 vs U3 unemployment gap |
| Q7 | Electric power output vs info employment |
| Q8 | CPI inflation month-over-month and year-over-year |
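To make the style concrete, here's what a Q2-shaped query could look like. The table and column names follow the schema described above; the exact names in the repo may differ:

```python
import sqlite3

Q2_SKETCH = """
WITH per_capita AS (
    SELECT e.date,
           e.series_id,
           e.value_covid_adjusted * 1000.0 / p.value_covid_adjusted AS per_1k
    FROM observations e
    JOIN observations p
      ON p.date = e.date AND p.series_id = 'CNP16OV'
    WHERE e.series_id IN ('USINFO', 'CES2023800001')
)
SELECT date,
       series_id,
       100.0 * per_1k / FIRST_VALUE(per_1k) OVER (
           PARTITION BY series_id ORDER BY date
       ) AS indexed_per_capita
FROM per_capita
ORDER BY series_id, date
"""

conn = sqlite3.connect("file:econ.db?mode=ro", uri=True)  # db filename assumed
rows = conn.execute(Q2_SKETCH).fetchall()
```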
Quick start
git clone https://github.com/ShameekConyers/sql_python_dashboard.git
cd sql_python_dashboard
python3 -m venv .venv
.venv/bin/pip install -r requirements-dev.txt
.venv/bin/streamlit run dashboard/app.py
No API key needed. The seed database ships with all 10 FRED series, 1,547 HN stories, 8 NMF topics, recession predictions, and 21 verified AI insights.
Tools: Python, SQL/SQLite, pandas, scikit-learn, pmdarima, Plotly, Streamlit, LangGraph, langchain, sentence-transformers, ChromaDB, Ollama, FRED API, Algolia HN API