Pdf To Xls Vision

D 52 completed
Other
cli / python · tiny
Files: 21
LOC: 2,040
Frameworks: 0
Languages: 4

Pipeline State

completed
Run ID
#396328
Phase
done
Progress
1%
Started
Finished
2026-04-13 01:31:02
LLM tokens
0

Pipeline Metadata

Stage
Skipped
Decision
skip_scaffold_dup
Novelty
19.14
Framework unique
Isolation
Last stage change
2026-04-16 18:15:42
Deduplication group #47306
Member of a group with 1 similar repo; canonical: #110889
Top concepts (2)
Project Description, Library

AI Prompt

Build me a command-line tool in Python that can intelligently convert PDF files containing tables into Excel (XLSX) format. The tool needs to handle both text-based and image-based PDFs, using the Claude Vision API for the latter. Key features should include automatic rotation detection and correction, and the ability to merge tables that span multiple pages into single sheets. I also need it to support batch processing of entire directories and generate a detailed Markdown report comparing extracted numbers with the source PDF. Please ensure it can process image files like JPG and PNG directly.
python cli pdf excel claude-vision automation data-extraction image-processing
Generated by gemma4:latest
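
The prompt above asks for tables that span multiple pages to be merged into a single sheet. A minimal sketch of one way to do that follows. It assumes each extracted table is a list of rows with the header in row 0, and that a table continues onto the next page when the next table repeats the same header. Both the `merge_spanning_tables` name and that heuristic are illustrative assumptions, not taken from the project.

```python
from typing import List

Table = List[List[str]]  # rows of cell strings; row 0 is the header


def merge_spanning_tables(page_tables: List[Table]) -> List[Table]:
    """Merge consecutive tables that share an identical header row.

    Assumption (not from the project): a table continues onto the next
    page when the next extracted table repeats the same header row.
    """
    merged: List[Table] = []
    for table in page_tables:
        if not table:
            continue  # skip empty extractions
        if merged and merged[-1][0] == table[0]:
            # Same header as the previous table: append only the data rows.
            merged[-1].extend(table[1:])
        else:
            # Different header: start a new output table (copied, so the
            # caller's list is never extended in place).
            merged.append([list(row) for row in table])
    return merged
```

A header-equality check is deliberately conservative: continuation pages that omit the header would need a different signal, such as matching column counts.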

Catalog Information

An intelligent Python library to convert PDF files containing tables into Excel (XLSX) files using Claude Vision API with automatic rotation detection. Each table found in the PDF becomes a separate sheet in the output Excel file.
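
Writing one sheet per table means every table needs a worksheet name that Excel accepts: at most 31 characters, none of the characters `\ / ? * [ ] :`, and no duplicates within a workbook. The sketch below shows one way such names could be derived from table titles. The `sheet_names` helper and the "Table N" fallback are hypothetical and not part of the project.

```python
import re
from typing import Iterable, List

_INVALID = re.compile(r"[\\/*?:\[\]]")  # characters Excel rejects in sheet names
MAX_LEN = 31  # Excel's limit on worksheet name length


def sheet_names(titles: Iterable[str]) -> List[str]:
    """Return a valid, unique worksheet name for each table title."""
    names: List[str] = []
    seen = set()
    for index, title in enumerate(titles, start=1):
        # Replace forbidden characters, trim spaces and apostrophes,
        # and fall back to a positional name when nothing is left.
        base = _INVALID.sub("_", title).strip(" '") or f"Table {index}"
        base = base[:MAX_LEN]
        name, n = base, 1
        # Excel compares sheet names case-insensitively.
        while name.lower() in seen:
            n += 1
            suffix = f" ({n})"
            name = base[: MAX_LEN - len(suffix)] + suffix
        seen.add(name.lower())
        names.append(name)
    return names
```

The resulting names could then be passed to whichever XLSX writer the tool uses when it creates each sheet.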

Novelty

3/10

Tags

python cli pdf excel claude-vision automation data-extraction image-processing

Technologies

anthropic

Claude Models

claude-opus-4-6

Quality Score

D
52.0/100
Structure
58
Code Quality
50
Documentation
65
Testing
0
Practices
62
Security
90
Dependencies
60

Strengths

  • Code linting configured (possibly ruff)
  • Consistent naming conventions (snake_case)
  • Good security practices: no major issues detected
  • Properly licensed project

Weaknesses

  • No tests found: high risk of regressions
  • No CI/CD configuration: manual testing and deployment
  • Potential hardcoded secrets in 1 file
  • 490 duplicate lines detected; consider DRY refactoring

Recommendations

  • Add a test suite, starting with critical-path integration tests
  • Set up CI/CD (GitHub Actions recommended) to automate testing and deployment
  • Move hardcoded secrets to environment variables or a secrets manager
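
The last recommendation can be sketched as a small helper that reads the API key from the environment instead of from source code. `ANTHROPIC_API_KEY` is the variable the Anthropic Python SDK reads by default; whether this project uses that name is an assumption, and `load_api_key` is a hypothetical helper rather than code from the repository.

```python
import os
from typing import Mapping, Optional


def load_api_key(env: Optional[Mapping[str, str]] = None,
                 name: str = "ANTHROPIC_API_KEY") -> str:
    """Fetch the API key from the environment, failing loudly if absent."""
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        # Fail early with a clear message instead of sending an empty key.
        raise RuntimeError(f"{name} is not set; export it before running the tool.")
    return key
```

Accepting the environment as a parameter keeps the helper easy to cover in the test suite the first recommendation calls for, since a plain dict can stand in for `os.environ`.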

Security & Health

Tech Debt (D): 6.1h
OWASP (100%): A
Quality Gate: PASS
Risk (7): A
License: MIT
Duplication: 16.1%

Languages

python
82.3%
markdown
14.7%
toml
2.6%
text
0.5%

Frameworks

None detected

Concepts (2)

Category | Name | Description | Confidence
auto_description | Project Description | An intelligent Python library to convert PDF files containing tables into Excel (XLSX) files using Claude Vision API with automatic rotation detection. Each table found in the PDF becomes a separate sheet in the output Excel file. | 80%
auto_category | Library | library | 70%

Quality Timeline

1 quality score recorded.

Embed Badge

Add to your README:

![Quality](https://repos.aljefra.com/badge/120688.svg)