Legal Knowledge Base

C+ 70 completed
Web App
containerized / python · small
52
Files
2,699
LOC
3
Frameworks
6
Languages

Pipeline State

completed
Run ID
#366327
Phase
done
Progress
1%
Started
Finished
2026-04-13 01:31:02
LLM tokens
0

Pipeline Metadata

Stage
Skipped
Decision
skip_scaffold_dup
Novelty
43.31
Framework unique
Isolation
Last stage change
2026-04-16 18:15:42
Deduplication group #48276
Member of a group with 1 similar repo; canonical repo: #84359
Top concepts (2)
Project Description, Web Backend

AI Prompt

Create a RAG-powered system for corporate legal teams to act as an institutional memory. I need it to allow users to upload various legal documents like PDFs, DOCX, TXT, or Markdown files. The core functionality should be natural-language Q&A, providing answers along with source citations and relevance scores. Please ensure it supports semantic search, metadata filtering by document type or practice area, and includes a learning tag system that auto-suggests metadata. The backend should use FastAPI, and the system should be containerized using Docker Compose.
python fastapi rag legal-tech document-retrieval sqlalchemy docker q&a semantic-search
Generated by gemma4:latest

Catalog Information

A RAG-powered system that delivers instant legal knowledge retrieval for corporate legal teams.

Description

This platform builds a searchable legal knowledge base that leverages Retrieval-Augmented Generation to answer complex legal questions. It integrates a vector store for fast document retrieval, an LLM for contextual responses, and a web interface for easy access. Users can upload internal documents, query the system, and receive concise, citation‑backed answers. Designed for law firms and corporate legal departments, it streamlines research, reduces time spent on manual searches, and ensures consistent information. The architecture combines FastAPI for the API layer, Streamlit for the UI, and a vector database for efficient similarity search.
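The retrieval step described above (similarity search, metadata filtering, and answers that carry source citations with relevance scores) can be sketched roughly as follows. This is a minimal illustration, not the project's code: the `Chunk` class, the `embed`, `cosine`, and `search` functions, and the sample file names are all hypothetical, and a toy term-frequency vector stands in for the real embedding model and vector database.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Chunk:
    """Hypothetical record for one indexed passage of an uploaded document."""
    source: str    # file the passage came from (used as the citation)
    doc_type: str  # metadata used for filtering, e.g. "memo" or "policy"
    text: str


def embed(text):
    """Toy embedding: term-frequency counts (assumption: stands in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def search(query, chunks, doc_type=None, top_k=3):
    """Return up to top_k citations, optionally restricted to one document type."""
    q = embed(query)
    candidates = [c for c in chunks if doc_type is None or c.doc_type == doc_type]
    scored = [(cosine(q, embed(c.text)), c) for c in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [
        {"source": c.source, "doc_type": c.doc_type, "score": round(s, 3)}
        for s, c in scored[:top_k]
        if s > 0  # drop passages with no overlap at all
    ]


# Made-up sample data, for illustration only.
chunks = [
    Chunk("nda_playbook.md", "playbook", "Standard NDA term is two years with mutual confidentiality"),
    Chunk("deal_memo.txt", "memo", "The acquisition closed in March after regulatory approval"),
    Chunk("privacy_policy.md", "policy", "Personal data retention policy is five years"),
]
hits = search("What is the standard NDA term?", chunks)
```

In a full pipeline, the top-ranked passages would be passed to the LLM as context, and the `source` and `score` fields would be returned alongside the generated answer as its citations.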

Description (translated from Arabic)

This system is a legal knowledge base that relies on Retrieval-Augmented Generation to provide accurate answers to complex legal questions. It integrates a vector database to speed up document retrieval, a large language model to generate contextual responses, and an easy-to-use web interface. Users can upload internal documents, ask questions, and receive concise answers backed by citations. The solution targets corporate legal teams and law firms, reducing the time spent on manual research and ensuring consistent information. It is built on an API layer using FastAPI, a Streamlit interface, and a vector database for efficient similarity search. The system strengthens institutional memory by keeping legal knowledge in one central place, making it easy to access over time. It features continuous data updates, seamless integration with existing document sources, and an advanced search experience that requires no specialized technical expertise.

Novelty

7/10

Tags

legal-knowledge-retrieval question-answering document-search ai-driven institutional-memory regulatory-compliance law-firm-support

Technologies

anthropic chromadb fastapi huggingface pydantic sqlalchemy streamlit uvicorn

Claude Models

claude-opus-4.6

Quality Score

C+
70.2/100
Structure
71
Code Quality
80
Documentation
50
Testing
50
Practices
78
Security
92
Dependencies
60

Strengths

  • Code linting configured (possibly ruff)
  • Consistent naming conventions (snake_case)
  • Good security practices: no major issues detected
  • Containerized deployment (Docker)
  • Properly licensed project

Weaknesses

  • No CI/CD configuration: testing and deployment are manual
  • 144 duplicate lines detected: consider DRY refactoring

Recommendations

  • Set up CI/CD (GitHub Actions recommended) to automate testing and deployment

Security & Health

4.1h
Tech Debt (C)
A
OWASP (100%)
PASS
Quality Gate
A
Risk (4)
MIT
License
3.2%
Duplication

Languages

python
80.6%
markdown
13.1%
shell
3.2%
toml
1.7%
yaml
0.9%
json
0.5%

Frameworks

FastAPI pytest SQLAlchemy

Concepts (2)

| Category | Name | Description | Confidence |
|---|---|---|---|
| auto_description | Project Description | RAG-powered institutional memory for legal teams. Upload legal documents (memos, deal summaries, playbooks, policies) and ask natural-language questions to get answers with source citations. | 80% |
| auto_category | Web Backend | web-backend | 70% |

Quality Timeline

1 quality score recorded.

Embed Badge

Add to your README:

![Quality](https://repos.aljefra.com/badge/90509.svg)