
An end-to-end Streamlit application that evaluates one or more resumes against a Job Description (JD), computes an ATS-style score, pinpoints gaps & improvements, and can produce a truthful, ATS-friendly rewritten resume aligned to the JD.
⚑ Built for macOS + PyCharm, but works anywhere Python runs.
Table of Contents
1. Features
2. Architecture
3. Tech Stack
4. Project Structure
5. Prerequisites
6. Quickstart
7. Configuration
8. Usage
9. Scoring Logic
10. Security & Privacy
11. Troubleshooting
12. Roadmap
13. FAQ
14. Contributing
15. License
1. Features
1. Multi-resume upload (PDF/DOCX/TXT) + single JD upload.
2. ATS score (0–100) for each resume vs. JD.
3. Ranked results with matched items, gaps, and rationale.
4. Actionable improvements mapped to specific JD items (skills/tools/years/responsibilities).
5. Truthful rewrite: generate an optimized final resume (TXT/DOCX) aligned to the JD, without fabricating details.
6. RAG + Prompt Chaining: grounded on retrieved resume chunks for reliability.
7. Optional Pinecone vector DB; defaults to in-memory retrieval.
8. Clear error handling and strict JSON output enforcement from the model.
2. Architecture

JD ──► JD Parser ──► JD Requirements (JSON)
                              │
Resumes ──► Parse & Chunk ──► Embed ──► (Pinecone | Local) ──► Retrieve Top-K
                              │                                     │
                              ▼                                     │
              GPT-4o Scoring Chain ◄────────────────────────────────┘
                              │
                              ├─► Gaps / Matched / Rationale
                              ▼
              GPT-4o Improvements Chain
                              │
                              ├─► Targeted Suggestions (where + how to fix)
                              ▼
              GPT-4o Truthful Rewrite Chain
                              │
                              └─► Optimized Resume (TXT/DOCX)
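The three chains run in sequence, each consuming the previous chain's JSON output. A minimal orchestration sketch (function names, prompt wording, and JSON keys here are illustrative, not the project's actual prompts; `llm` stands in for any callable that sends a prompt to GPT-4o and returns a JSON string):

```python
import json

def score_resume(llm, jd_requirements: dict, resume_chunks: list) -> dict:
    # Scoring chain: JD requirements + retrieved evidence in, strict JSON out.
    prompt = (
        "Score this resume against the JD requirements. "
        "Return JSON with keys: score, matched, gaps, rationale.\n"
        f"JD: {json.dumps(jd_requirements)}\nEvidence: {resume_chunks}"
    )
    return json.loads(llm(prompt))

def suggest_improvements(llm, scoring: dict) -> dict:
    # Improvements chain: grounded only on the gaps the scoring chain found.
    prompt = (
        "Given these gaps, return JSON {\"improvements\": [...]} with "
        "targeted, truthful edits.\n"
        f"Gaps: {json.dumps(scoring['gaps'])}"
    )
    return json.loads(llm(prompt))

def run_chain(llm, jd_requirements: dict, resume_chunks: list):
    scoring = score_resume(llm, jd_requirements, resume_chunks)
    improvements = suggest_improvements(llm, scoring)
    return scoring, improvements
```

Because `llm` is injected, the chain can be exercised with a stub in tests and swapped for the real GPT-4o client in the app.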
3. Tech Stack
Frontend: Streamlit
LLM: OpenAI GPT-4o (chat.completions with response_format={"type": "json_object"})
Embeddings: OpenAI text-embedding-3-large
RAG: in-memory cosine similarity or Pinecone (optional)
Parsing: pypdf, docx2txt
Packaging: python-docx for DOCX exports
Validation: pydantic for schema checks
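The project validates model output against a schema with pydantic. A stdlib-only stand-in illustrating the same idea (the field names are illustrative, not the project's actual schema):

```python
from dataclasses import dataclass, field

@dataclass
class JDRequirements:
    role: str
    must_have_skills: list = field(default_factory=list)
    tools: list = field(default_factory=list)
    responsibilities: list = field(default_factory=list)

def validate_jd(payload: dict) -> JDRequirements:
    # Reject malformed extractions early instead of failing mid-pipeline.
    if not isinstance(payload.get("role"), str) or not payload["role"]:
        raise ValueError("JD extraction missing required 'role' string")
    for key in ("must_have_skills", "tools", "responsibilities"):
        if not isinstance(payload.get(key, []), list):
            raise ValueError(f"'{key}' must be a list")
    return JDRequirements(
        role=payload["role"],
        must_have_skills=payload.get("must_have_skills", []),
        tools=payload.get("tools", []),
        responsibilities=payload.get("responsibilities", []),
    )
```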
5. Prerequisites
Python 3.10+
An OpenAI API key with access to GPT-4o and text-embedding-3-large
(Optional) Pinecone API key if you want cloud vector search
macOS + PyCharm is the primary target environment, but VS Code / CLI works fine.
6. Quickstart
git clone https://github.com/<your-username>/genetix-ats.git
cd genetix-ats
python -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
python -m pip install --upgrade pip
python -m pip install -r requirements.txt
# Create your .env from template
cp .env.example .env
# Edit .env and set OPENAI_API_KEY, optionally Pinecone keys
streamlit run app.py
Open the URL Streamlit prints (usually http://localhost:8501).
7. Configuration
Create a .env (or set env vars in PyCharm):
# REQUIRED
OPENAI_API_KEY=sk-…
# OPTIONAL (enable Pinecone by toggling in the UI)
PINECONE_API_KEY=pcn-…
PINECONE_ENV=us-east-1
PINECONE_INDEX=genetix-ats
Security: Do not commit .env to source control.
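A minimal loader for these settings might look like the following sketch, which fails fast when the required key is missing (the function name is illustrative; the defaults mirror the sample values above):

```python
import os

def load_config() -> dict:
    # Raise at startup rather than failing mid-analysis with a cryptic API error.
    api_key = os.environ.get("OPENAI_API_KEY")
    if not api_key:
        raise RuntimeError("OPENAI_API_KEY is not set; add it to .env")
    return {
        "openai_api_key": api_key,
        "pinecone_api_key": os.environ.get("PINECONE_API_KEY"),  # optional
        "pinecone_env": os.environ.get("PINECONE_ENV", "us-east-1"),
        "pinecone_index": os.environ.get("PINECONE_INDEX", "genetix-ats"),
    }
```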
8. Usage
Start the app: streamlit run app.py
Upload
Job Description: 1 file (PDF/DOCX/TXT)
Resumes: 1+ files (PDF/DOCX/TXT)
Click Analyze
Review:
Extracted JD requirements
Ranking table (ATS scores)
Per-resume panel with:
Matched items
Gaps (missing/weak areas)
Actionable improvements (section + concrete rewrite)
Evidence (top retrieved chunks)
Click Generate optimized resume inside a resume panel to get a truthful, JD-aligned resume.
Download as TXT or DOCX.
9. Scoring Logic
Extraction: The JD is converted into a structured JSON (role, must-have skills, tools, responsibilities, education, certs).
Retrieval: Resumes are parsed, chunked, embedded, and top-K relevant chunks are retrieved per JD.
Scoring (0–100): GPT-4o compares the JD vs. retrieved resume evidence:
Coverage of must-have skills/years/tools
Alignment to responsibilities
Education/certifications as applicable
Improvements: Targeted, measurable suggestions referencing where to edit (e.g., “Experience: Company X 2022–2024”).
Rewrite: Produces a clean, ATS-friendly resume (Summary, Skills, Experience, Projects, Education, Certs), no fabrication.
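The chunk-and-retrieve step above can be sketched with pure Python (the real app embeds text with text-embedding-3-large; here `chunk_vecs` stands in for those embedding vectors, and the chunk size/overlap values are illustrative):

```python
import math

def cosine(a, b) -> float:
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def chunk_text(text: str, size: int = 800, overlap: int = 100) -> list:
    # Overlapping windows so evidence spanning a boundary isn't lost.
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += size - overlap
    return chunks

def retrieve_top_k(query_vec, chunk_vecs, chunks, k: int = 5) -> list:
    # Rank chunks by similarity to the JD query vector; keep the best k.
    scored = sorted(zip(chunks, chunk_vecs),
                    key=lambda cv: cosine(query_vec, cv[1]), reverse=True)
    return [chunk for chunk, _ in scored[:k]]
```

This is what the local (non-Pinecone) retrieval path boils down to; Pinecone replaces the `sorted` scan with an index query.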
10. Security & Privacy
Truthfulness: The rewrite chain forbids fabrication of titles, employers, dates, degrees, or achievements.
Data handling: Input files are processed in memory. No data is persisted unless you add storage.
PII: Treat resumes as sensitive. Clear uploaded files after use if you extend persistence.
API keys: Use environment variables; never hardcode secrets.
11. Troubleshooting
Common
json.decoder.JSONDecodeError:
We enforce response_format={"type": "json_object"} and sanitize the JSON, but if the LLM still returns unexpected content, the UI displays the raw error. Re-run the analysis.
“No text extracted”:
Some PDFs are image-only. Add OCR (e.g., Tesseract) if needed.
Empty/weak improvements:
Ensure the JD is detailed; increase Top-K and chunk size in the sidebar; consider enabling Pinecone.
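The sanitation step mentioned above can be sketched as follows (a minimal version, assuming the most common failure mode of the model wrapping its JSON in markdown fences; the function name is illustrative):

```python
import json
import re

def parse_model_json(raw: str) -> dict:
    # Strip markdown code fences the model sometimes adds around its JSON.
    cleaned = re.sub(r"^```(?:json)?\s*|\s*```$", "", raw.strip())
    try:
        return json.loads(cleaned)
    except json.JSONDecodeError:
        # Fall back to the first {...} block before giving up.
        match = re.search(r"\{.*\}", cleaned, re.DOTALL)
        if match:
            return json.loads(match.group(0))
        raise
```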
Environment
Verify Python ≥ 3.10
pip install -r requirements.txt
Confirm OPENAI_API_KEY is loaded (PyCharm Run Config or .env)
12. Roadmap
Weighted scoring editor (e.g., must-haves 60%, tools 25%, responsibilities 15%)
OCR support for image-based PDFs
Export PDF and structured JSON reports per resume
Dataset-driven evaluation and A/B testing of prompts
Optional audit log & admin dashboard
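The weighted scoring editor from the first roadmap item could reduce to something like this sketch (the categories and 60/25/15 split come from the example above; everything else is illustrative):

```python
# Per-category coverage is a 0-1 fraction of JD items satisfied; weights are
# user-configurable and normalized so they need not sum to exactly 1.
DEFAULT_WEIGHTS = {"must_haves": 0.60, "tools": 0.25, "responsibilities": 0.15}

def weighted_ats_score(coverage: dict, weights: dict = DEFAULT_WEIGHTS) -> int:
    total = sum(weights.values())
    score = sum(coverage.get(cat, 0.0) * w for cat, w in weights.items()) / total
    return round(score * 100)
```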
13. FAQ
Q: Is this a “real ATS”?
A: It’s an AI evaluator that simulates ATS-style scoring with grounded retrieval and GPT-4o reasoning. It’s not a full Applicant Tracking System.
Q: Will it make up experience to fit the JD?
A: No. The rewrite chain forbids fabrication. It reorganizes, tightens, and clarifies based on what’s already in the resume.
Q: Do I need Pinecone?
A: No. Pinecone is optional. Local retrieval is the default and fast for small batches.
Q: Can I deploy this?
A: Yes—Streamlit Cloud, a simple VM, or Docker/ECS. Ensure secrets are in env vars and add HTTPS at the edge.
14. Contributing
Fork the repo
Create a feature branch: git checkout -b feat/<feature-name>
Commit changes: git commit -m "feat: <description>"
Push: git push origin feat/<feature-name>
Open a Pull Request
Before submitting, please run:
python -m pip install -r requirements.txt
# (optional) add linters/formatters as you like
15. License
This project is licensed under the MIT License. See LICENSE for details.
Appendix: Key Files (high-level)
app.py — Streamlit UI, upload handling, orchestration of chains, rendering results, downloads
parsing.py — Robust parsing for PDF/DOCX/TXT
rag.py — Chunking, embeddings, cosine similarity; optional Pinecone index & query
prompts.py — Strict, JSON-only prompts for JD extraction, scoring, improvements, and truthful rewrite
scoring.py — LLM calls with JSON sanitation, schema validation, retries; rewrite generation
utils.py — OpenAI client init, JSON utilities, DOCX export helper

# Candidate-Profiling-and-Optimization-Tool
A system designed to analyze and optimize candidate profiles by assessing key skills, experience, and compatibility with job requirements. It helps recruiters streamline hiring decisions and identify the best-fit candidates efficiently.