Candidate-Profiling-and-Optimization-Tool

October 15, 2025

An end-to-end Streamlit application that evaluates one or more resumes against a Job Description (JD), computes an ATS-style score, pinpoints gaps & improvements, and can produce a truthful, ATS-friendly rewritten resume aligned to the JD.

⚑ Built for macOS + PyCharm, but works anywhere Python runs.

Table of Contents

1. Features
2. Architecture
3. Tech Stack
4. Project Structure
5. Prerequisites
6. Quickstart
7. Configuration
8. Usage
9. Scoring Logic
10. Security & Privacy
11. Troubleshooting
12. Roadmap
13. FAQ
14. Contributing
15. License

1. Features

1. Multi-resume upload (PDF/DOCX/TXT) + single JD upload.
2. ATS score (0–100) for each resume vs. JD.
3. Ranked results with matched items, gaps, and rationale.
4. Actionable improvements mapped to specific JD items (skills/tools/years/responsibilities).
5. Truthful rewrite: generate an optimized final resume (TXT/DOCX) aligned to the JD, without fabricating details.
6. RAG + Prompt Chaining: grounded on retrieved resume chunks for reliability.
7. Optional Pinecone vector DB; defaults to in-memory retrieval.
8. Clear error handling and strict JSON output enforcement from the model.
2. Architecture

             ┌───────────┐
JD ────────► │ JD Parser │ ──► JD Requirements (JSON) ──┐
             └───────────┘                              │
                                                        ▼
Resumes ──► Parse & Chunk ──► Embed ──► (Pinecone | Local) ──► Retrieve Top-K
                                                        │
                                                        ▼
                                             GPT-4o Scoring Chain
                                                        ├─► Gaps / Matched / Rationale
                                                        ▼
                                             GPT-4o Improvements Chain
                                                        ├─► Targeted Suggestions (where + how to fix)
                                                        ▼
                                             GPT-4o Truthful Rewrite Chain
                                                        └─► Optimized Resume (TXT/DOCX)
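The chained flow above can be sketched in miniature. The real chains in scoring.py call GPT-4o, so the functions below are stand-in stubs whose names and return shapes are illustrative only; the point is the data flow from JD extraction through retrieval to scoring.

```python
# Stand-in stubs showing the shape of the prompt chain; the real
# implementations call GPT-4o and the embedding model.

def extract_jd(jd_text: str) -> dict:
    """Chain 1: JD text -> structured requirements (stubbed)."""
    return {"role": "Data Analyst", "must_have": ["python", "sql"]}

def retrieve_evidence(resume_text: str, jd: dict, k: int = 5) -> list:
    """Stand-in for chunk + embed + top-K retrieval: naive sentence split."""
    return resume_text.split(". ")[:k]

def score_resume(jd: dict, evidence: list) -> dict:
    """Chain 2: compare JD requirements against retrieved evidence (stubbed)."""
    hits = [s for s in jd["must_have"] if any(s in c.lower() for c in evidence)]
    return {
        "score": round(100 * len(hits) / len(jd["must_have"])),
        "matched": hits,
        "gaps": [s for s in jd["must_have"] if s not in hits],
    }

def analyze(jd_text: str, resume_text: str) -> dict:
    """Orchestrate: extract -> retrieve -> score, as in the diagram."""
    jd = extract_jd(jd_text)
    evidence = retrieve_evidence(resume_text, jd)
    return score_resume(jd, evidence)
```

The improvements and rewrite chains hang off the same pattern: each takes the previous chain's JSON output as grounded context.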

3. Tech Stack

Frontend: Streamlit
LLM: OpenAI GPT-4o (chat.completions with response_format={"type": "json_object"})
Embeddings: OpenAI text-embedding-3-large
RAG: in-memory cosine similarity or Pinecone (optional)
Parsing: pypdf, docx2txt
Packaging: python-docx for DOCX exports
Validation: pydantic for schema checks

4. Project Structure

genetix-ats/
├─ app.py            # Streamlit app (UI + orchestration)
├─ parsing.py        # Parse PDF/DOCX/TXT into text
├─ rag.py            # Chunking, embeddings, retrieval (local/Pinecone)
├─ scoring.py        # GPT chains: JD extract, score, improve, rewrite
├─ prompts.py        # System/user prompts for each chain
├─ utils.py          # OpenAI client, JSON utils, DOCX export
├─ requirements.txt
├─ .env.example
└─ README.md         # this file

5. Prerequisites
Python 3.10+
An OpenAI API key with access to GPT-4o and text-embedding-3-large
(Optional) Pinecone API key if you want cloud vector search
macOS + PyCharm is the primary target environment, but VS Code / CLI works fine.

6. Quickstart

git clone https://github.com/<your-username>/genetix-ats.git
cd genetix-ats

python -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate

python -m pip install --upgrade pip
python -m pip install -r requirements.txt

# Create your .env from template
cp .env.example .env
# Edit .env and set OPENAI_API_KEY, optionally Pinecone keys

streamlit run app.py

Open the URL Streamlit prints (usually http://localhost:8501).

7. Configuration

Create a .env (or set env vars in PyCharm):

# REQUIRED
OPENAI_API_KEY=sk-…

# OPTIONAL (enable Pinecone by toggling in the UI)
PINECONE_API_KEY=pcn-…
PINECONE_ENV=us-east-1
PINECONE_INDEX=genetix-ats

Security: Do not commit .env to source control.
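The app presumably loads these variables with python-dotenv or via the PyCharm run configuration; the dependency-free sketch below shows the same idea, with deliberately simplified parsing (KEY=VALUE lines, `#` comments, no quoting rules).

```python
import os

def load_env(path: str = ".env") -> None:
    """Minimal .env loader: set each KEY=VALUE line into os.environ.

    Existing environment variables win (setdefault), matching the usual
    dotenv convention.
    """
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())
```

In practice, prefer the maintained python-dotenv package; this sketch exists only to make the mechanism concrete.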

8. Usage

1. Start the app: streamlit run app.py
2. Upload:
   - Job Description: 1 file (PDF/DOCX/TXT)
   - Resumes: 1+ files (PDF/DOCX/TXT)
3. Click Analyze.
4. Review:
   - Extracted JD requirements
   - Ranking table (ATS scores)
   - Per-resume panel with:
     - Matched items
     - Gaps (missing/weak areas)
     - Actionable improvements (section + concrete rewrite)
     - Evidence (top retrieved chunks)
5. Click Generate optimized resume inside a resume panel to get a truthful, JD-aligned resume.
6. Download as TXT or DOCX.

9. Scoring Logic

1. Extraction: The JD is converted into structured JSON (role, must-have skills, tools, responsibilities, education, certs).
2. Retrieval: Resumes are parsed, chunked, and embedded; the top-K most relevant chunks are retrieved per JD.
3. Scoring (0–100): GPT-4o compares the JD against the retrieved resume evidence:
   - Coverage of must-have skills/years/tools
   - Alignment to responsibilities
   - Education/certifications as applicable
4. Improvements: Targeted, measurable suggestions referencing where to edit (e.g., "Experience: Company X 2022–2024").
5. Rewrite: Produces a clean, ATS-friendly resume (Summary, Skills, Experience, Projects, Education, Certs) with no fabrication.
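The local retrieval step presumably works along these lines. This is a pure-Python sketch with toy vectors; the real rag.py embeds chunks with text-embedding-3-large, and the function names here are illustrative.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def top_k(query_vec, chunk_vecs, k=3):
    """Rank (chunk_text, vector) pairs by similarity to the query, best first."""
    ranked = sorted(chunk_vecs, key=lambda cv: cosine(query_vec, cv[1]), reverse=True)
    return [text for text, _ in ranked[:k]]
```

With Pinecone enabled, the sort above is replaced by an index query, but the contract is the same: query vector in, top-K chunk texts out.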

10. Security & Privacy

Truthfulness: The rewrite chain forbids fabrication of titles, employers, dates, degrees, or achievements.
Data handling: Input files are processed in memory. No data is persisted unless you add storage.
PII: Treat resumes as sensitive. Clear uploaded files after use if you extend persistence.
API keys: Use environment variables; never hardcode secrets.

11. Troubleshooting

Common

json.decoder.JSONDecodeError:
We enforce response_format={"type": "json_object"} and sanitize JSON, but if the LLM returns unexpected content, the UI will display the raw error. Re-run.

“No text extracted”:
Some PDFs are image-only. Add OCR (e.g., Tesseract) if needed.

Empty/weak improvements:
Ensure the JD is detailed; increase Top-K and chunk size in the sidebar; consider enabling Pinecone.
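The "No text extracted" case can be caught early with a simple guard before any LLM calls; the threshold below is an arbitrary heuristic, and the OCR fallback itself (e.g., Tesseract) is out of scope here.

```python
def looks_empty(extracted: str, min_chars: int = 50) -> bool:
    """Heuristic: image-only PDFs yield little or no extractable text."""
    return len(extracted.strip()) < min_chars
```

When this returns True, the UI can prompt the user to supply a text-based PDF or enable OCR rather than scoring an effectively blank resume.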

Environment
Verify Python ≥ 3.10
pip install -r requirements.txt
Confirm OPENAI_API_KEY is loaded (PyCharm Run Config or .env)

12. Roadmap

Weighted scoring editor (e.g., must-haves 60%, tools 25%, responsibilities 15%)
OCR support for image-based PDFs
Export PDF and structured JSON reports per resume
Dataset-driven evaluation and A/B testing of prompts
Optional audit log & admin dashboard
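The weighted scoring editor on the roadmap could reduce to something like the sketch below, using the example weights from the list above (must-haves 60%, tools 25%, responsibilities 15%). The function name and coverage-dict shape are assumptions for illustration.

```python
def weighted_score(coverage, weights=None):
    """Combine per-category coverage fractions (0.0-1.0) into a 0-100 score.

    Default weights mirror the roadmap example: must-haves 60%,
    tools 25%, responsibilities 15%. Weights are normalized, so a
    user-edited set need not sum to exactly 1.0.
    """
    weights = weights or {"must_haves": 0.60, "tools": 0.25, "responsibilities": 0.15}
    total = sum(weights.values())
    raw = sum(w * coverage.get(name, 0.0) for name, w in weights.items())
    return round(100 * raw / total)
```

Normalizing by the weight total keeps the score stable when a user tweaks one slider without rebalancing the others.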

13. FAQ
Q: Is this a “real ATS”?
A: It’s an AI evaluator that simulates ATS-style scoring with grounded retrieval and GPT-4o reasoning. It’s not a full Applicant Tracking System.

Q: Will it make up experience to fit the JD?
A: No. The rewrite chain forbids fabrication. It reorganizes, tightens, and clarifies based on what’s already in the resume.

Q: Do I need Pinecone?
A: No. Pinecone is optional. Local retrieval is the default and fast for small batches.

Q: Can I deploy this?
A: Yes—Streamlit Cloud, a simple VM, or Docker/ECS. Ensure secrets are in env vars and add HTTPS at the edge.

14. Contributing

1. Fork the repo
2. Create a feature branch: git checkout -b feat/<feature-name>
3. Commit changes: git commit -m "feat: <description>"
4. Push: git push origin feat/<feature-name>
5. Open a Pull Request
Before submitting, please run:

python -m pip install -r requirements.txt
# (optional) add linters/formatters as you like

15. License

This project is licensed under the MIT License. See LICENSE for details.

Appendix: Key Files (high-level)

app.py — Streamlit UI, upload handling, orchestration of chains, rendering results, downloads

parsing.py — Robust parsing for PDF/DOCX/TXT

rag.py — Chunking, embeddings, cosine similarity; optional Pinecone index & query

prompts.py — Strict, JSON-only prompts for JD extraction, scoring, improvements, and truthful rewrite

scoring.py — LLM calls with JSON sanitation, schema validation, retries; rewrite generation

utils.py — OpenAI client init, JSON utilities, DOCX export helper