Unredact

C 65 completed
Api
monorepo / text · small
Files: 109
LOC: 38,843
Frameworks: 3
Languages: 10

Pipeline State

completed
Run ID: #355648
Phase: done
Progress: 1%
Started:
Finished: 2026-04-13 01:31:02
LLM tokens: 0

Pipeline Metadata

Stage: Cataloged
Decision: proceed
Novelty: 78.00
Framework unique
Isolation
Last stage change: 2026-05-10 03:35:31
Deduplication group #55946
Member of a group with 1 similar repo; this repo is canonical.
Top concepts (2): Project Description, Web Backend

AI Prompt

Create a web-based AI tool called 'unredact' that analyzes redacted PDF files directly in the browser. The tool should use computer vision and LLM reasoning to assess and reconstruct hidden content by finding possible words or names that fit the redacted areas. Users should be able to upload a PDF, and the interface must display the proposed guesses for the hidden text, allowing visual comparison against the original PDF. Since this runs client-side, ensure no files are sent to a server.
ai pdf computer-vision llm web-app python javascript browser-based text-analysis
Generated by gemma4:latest

Catalog Information

An AI tool that analyzes redacted PDFs to assess and reconstruct hidden content using computer vision, constraint solving, and large language model reasoning.

Description

Unredact is an AI-powered service that examines PDFs containing redacted sections. It uses computer vision to locate blacked-out areas, applies constraint-solving techniques to infer possible underlying text, and leverages large language models to interpret and reconstruct the content. The tool produces detailed reports on the adequacy of redactions, highlights potential leaks, and offers recommendations for improvement. It is designed for compliance teams, legal professionals, and forensic analysts who need automated, reliable redaction verification. By combining multiple AI disciplines, Unredact reduces manual effort and mitigates the risk of sensitive data exposure.
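
The first step described above, locating blacked-out areas, can be illustrated with a minimal sketch. It is not taken from the Unredact source. It assumes the page has already been rasterized to a grayscale grid (0 = black, 255 = white) and it collapses every long dark run into one bounding box. The function names, the threshold, and the minimum run length are hypothetical.

```python
def find_dark_runs(row, threshold=40, min_length=4):
    """Return half-open (start, end) column spans of consecutive dark pixels."""
    runs, start = [], None
    for x, value in enumerate(row):
        if value <= threshold:
            if start is None:
                start = x          # a dark run begins here
        elif start is not None:
            if x - start >= min_length:
                runs.append((start, x))
            start = None           # the run ended on a light pixel
    # Close a run that reaches the right edge of the row.
    if start is not None and len(row) - start >= min_length:
        runs.append((start, len(row)))
    return runs


def find_redaction_box(image, threshold=40, min_length=4):
    """Return (x0, y0, x1, y1) enclosing all long dark runs, or None if there are none.

    Simplification: a real page may hold several redactions; this sketch
    merges everything into a single box.
    """
    spans = [
        (x0, y, x1)
        for y, row in enumerate(image)
        for x0, x1 in find_dark_runs(row, threshold, min_length)
    ]
    if not spans:
        return None
    return (
        min(s[0] for s in spans),
        min(s[1] for s in spans),
        max(s[2] for s in spans),
        max(s[1] for s in spans) + 1,
    )


# Synthetic 12x5 "page": white background with one black bar.
image = [[255] * 12 for _ in range(5)]
for y in range(1, 4):
    for x in range(2, 10):
        image[y][x] = 0

print(find_redaction_box(image))
```

Short dark runs, such as the strokes of ordinary glyphs, fall below `min_length` and are ignored, which is why the sketch filters on run length rather than on darkness alone.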

Description (translated from Arabic)

This project provides a tool for analyzing PDF documents that contain redacted regions. It uses computer vision techniques to locate the redacted regions, solves mathematical constraints to estimate the likely content, and employs large language models to interpret the recovered text. It lets users check how effective the redactions are and confirm that no sensitive information has been left unredacted. It can also generate detailed reports that show weaknesses in the redaction, along with recommendations for improving it. The tool targets compliance teams, legal teams, and forensic analysts who need an accurate assessment of redacted documents. The project addresses the difficulty of manually verifying redaction quality in large documents and reduces human error. It distinguishes itself with a multidisciplinary approach that combines artificial intelligence, constraint solving, and natural language processing to deliver a comprehensive analysis.
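
The constraint-solving step can be sketched as a width filter: a candidate word is kept only if its estimated rendered width matches the width of the redacted box. This is an illustration under stated assumptions, not the project's actual solver. The per-character widths, the `RedactionBox` type, and the tolerance are all hypothetical; real widths would come from the font metrics of the document being checked.

```python
from dataclasses import dataclass

# Hypothetical advance widths in points for an assumed proportional font.
DEFAULT_WIDTH = 6.0
CHAR_WIDTHS = {"i": 3.0, "l": 3.0, "m": 9.0, "w": 9.0, " ": 3.0}


@dataclass(frozen=True)
class RedactionBox:
    """Horizontal extent of a blacked-out region, in points."""
    x0: float
    x1: float

    @property
    def width(self) -> float:
        return self.x1 - self.x0


def text_width(text: str) -> float:
    """Estimate rendered width by summing per-character advances (case-insensitive)."""
    return sum(CHAR_WIDTHS.get(ch.lower(), DEFAULT_WIDTH) for ch in text)


def fitting_candidates(box, candidates, tolerance=1.5):
    """Keep candidates whose estimated width is within `tolerance` points of the box."""
    return [c for c in candidates if abs(text_width(c) - box.width) <= tolerance]


box = RedactionBox(x0=100.0, x1=130.0)          # a 30 pt wide redaction
names = ["Alice", "Bob", "William", "Smith"]
print(fitting_candidates(box, names))           # ['Smith']
```

A short candidate list that survives this filter is itself a sign of a weak redaction, since the box width alone narrows down the hidden text.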

Novelty

8/10

Tags

redaction-analysis pdf-processing ai-reconstruction compliance-verification data-privacy forensic-analysis llm-reasoning constraint-solving

Technologies

anthropic fastapi numpy pandas scipy uvicorn

Claude Models

claude-opus-4.6

Quality Score

C
65.0/100
Structure
71
Code Quality
63
Documentation
60
Testing
60
Practices
59
Security
82
Dependencies
60

Strengths

  • Good test coverage (45% test-to-source ratio)
  • Code linting configured (possibly ruff)
  • Consistent naming conventions (snake_case)
  • Good security practices: no major issues detected
  • Properly licensed project

Weaknesses

  • No CI/CD configuration; testing and deployment are manual
  • 1 file with critical complexity needs refactoring
  • Potential hardcoded secrets in 1 file
  • 1975 duplicate lines detected; consider DRY refactoring
  • 3 'god files' with >500 LOC need decomposition

Recommendations

  • Set up CI/CD (GitHub Actions recommended) to automate testing and deployment
  • Move hardcoded secrets to environment variables or a secrets manager
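
The second recommendation can be illustrated with a minimal sketch that reads a key from the environment instead of embedding it in source. The variable name `UNREDACT_LLM_API_KEY` and the helper `load_api_key` are hypothetical; the scan does not say which file or which secret was flagged.

```python
import os


def load_api_key(env_var: str = "UNREDACT_LLM_API_KEY") -> str:
    """Read the key from the environment and fail early if it is missing."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable before starting the service"
        )
    return key
```

Failing at startup, rather than at the first request that needs the key, makes a missing secret visible immediately in deployment logs.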

Security & Health

Tech Debt (A): 7.6h
OWASP (100%): A
Quality Gate: PASS
Risk (0): A
License: MIT
Duplication: 5.2%

Languages

text: 61.2%
python: 17.2%
javascript: 9.2%
rust: 7.7%
css: 2.7%
json: 0.9%
html: 0.7%
markdown: 0.2%
toml: 0.2%
shell: 0.1%

Frameworks

FastAPI Axum pytest

Concepts (2)

| Category | Name | Description | Confidence |
|---|---|---|---|
| auto_description | Project Description | ![Download unredact](https://github.com/lolly6996/unredact/releases) | 80% |
| auto_category | Web Backend | web-backend | 70% |

Quality Timeline

1 quality score recorded.

Embed Badge

Add to your README:

![Quality](https://repos.aljefra.com/badge/79774.svg)