dmt-eval

Grade: C+ (76) · Status: completed
Type: Library (unknown / python · small)
Files: 78
LOC: 14,148
Frameworks: 1
Languages: 4

Pipeline State

Status: completed
Run ID: #359099
Phase: done
Progress: 1%
Started: (not recorded)
Finished: 2026-04-13 01:31:02
LLM tokens: 0

Pipeline Metadata

Stage: Skipped
Decision: skip_scaffold_dup
Novelty: 42.67
Framework: unique
Isolation
Last stage change: 2026-04-16 18:15:42
Deduplication group: #47778 (member of a group with 1 similar repo; canonical: #22814)
Top concepts (2): Project Description, Testing

AI Prompt

Create a universal validation framework in Python called dmt-eval. I need it to assess data quality, model performance, and test coverage for various computational models. The framework should support model-agnostic adapters to evaluate different types of models, generate structured narrative reports (like LabReports), and allow for parameterized measurement sweeps. The core functionality should allow evaluating a model against a dataset using metrics like accuracy and latency, and finally generating a structured Markdown report. Please ensure it uses pytest for testing.
Tags: python pytest ai-evaluation llm framework data-validation scientific-computing testing
Generated by: gemma4:latest
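
As a rough illustration of the API shape the prompt asks for, here is a minimal Python sketch. Every name in it (CallableAdapter, evaluate, to_markdown) is a hypothetical stand-in, not the repository's actual interface:

    import time

    class CallableAdapter:
        """Model-agnostic adapter: wraps anything callable as a model."""
        def __init__(self, fn):
            self.fn = fn
        def predict(self, x):
            return self.fn(x)

    def evaluate(adapter, dataset, metrics=("accuracy", "latency")):
        """Run the adapter over (input, expected) pairs and score it."""
        correct, latencies = 0, []
        for x, expected in dataset:
            start = time.perf_counter()
            got = adapter.predict(x)
            latencies.append(time.perf_counter() - start)
            correct += int(got == expected)
        results = {}
        if "accuracy" in metrics:
            results["accuracy"] = correct / len(dataset)
        if "latency" in metrics:
            results["latency_ms"] = 1000 * sum(latencies) / len(latencies)
        return results

    def to_markdown(name, results):
        """Render results as the structured Markdown report the prompt describes."""
        lines = [f"# LabReport: {name}", "", "| Metric | Value |", "|---|---|"]
        lines += [f"| {k} | {v:.4f} |" for k, v in results.items()]
        return "\n".join(lines)

    dataset = [(1, 1), (2, 4), (3, 9)]
    report = to_markdown("square-model", evaluate(CallableAdapter(lambda x: x * x), dataset))
    print(report)

The adapter layer is what would make such an engine model-agnostic: anything that can be wrapped behind predict(x) can be scored by the same metric loop.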

Catalog Information

A universal validation framework that uses large language models to assess data quality, model performance, and test coverage.

Description

This framework provides a unified approach to validate data sets, evaluate machine‑learning models, and analyze test coverage using large language models. It offers a modular test‑case engine that can be scripted in Python or invoked from the command line, and it integrates with pandas for data manipulation. By leveraging OpenAI and Anthropic APIs, it generates detailed reports highlighting strengths, weaknesses, and actionable insights. The tool is designed for data scientists, ML engineers, and QA teams who need automated, repeatable validation across the entire development pipeline. It addresses common pain points such as inconsistent data, hidden model biases, and incomplete test suites, helping teams deliver higher‑quality products faster.
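
To make the "modular test-case engine" idea concrete, the Python-scripting side of a pandas-backed data-quality suite might look like the following; run_suite and the check names are illustrative assumptions, not the project's real API:

    import pandas as pd

    def no_nulls(df: pd.DataFrame) -> bool:
        """Fail if any cell in the frame is missing."""
        return not df.isnull().values.any()

    def unique_ids(df: pd.DataFrame) -> bool:
        """Fail if the id column contains duplicates."""
        return df["id"].is_unique

    def run_suite(df: pd.DataFrame, checks) -> dict:
        """Apply each named check to the frame and collect pass/fail results."""
        return {name: check(df) for name, check in checks.items()}

    df = pd.DataFrame({"id": [1, 2, 3], "value": [0.1, 0.2, 0.3]})
    results = run_suite(df, {"no_nulls": no_nulls, "unique_ids": unique_ids})
    print(results)  # {'no_nulls': True, 'unique_ids': True}

In the workflow the description outlines, the resulting pass/fail map would then be handed to an OpenAI or Anthropic model to produce the narrative report; the LLM call is omitted here to keep the sketch self-contained.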

Description (translated from Arabic)

This framework offers a comprehensive solution for assessing data quality, model performance, and test coverage using large language models. It lets users build custom test suites that are applied to datasets or machine-learning models, with the option of integrating advanced analysis via the pandas library. It relies on AI APIs such as OpenAI and Anthropic to generate detailed reports that surface strengths and weaknesses at every stage. It integrates easily with CI/CD pipelines, ensuring an automatic check before any update is deployed. It targets engineers building models or data systems, helping them reduce errors and improve product reliability. It stands apart from traditional solutions in its ability to combine manual verification and AI in a single unified framework.
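
The CI/CD integration mentioned above could reduce to a small gate script that fails the pipeline whenever a check fails; this is a sketch under that assumption, reusing the hypothetical run_suite() results from the previous example:

    import sys

    def gate(results: dict) -> int:
        """Return a nonzero exit code if any validation check failed."""
        failed = [name for name, ok in results.items() if not ok]
        for name in failed:
            print(f"FAIL: {name}", file=sys.stderr)
        return 1 if failed else 0

    if __name__ == "__main__":
        # A CI step would run this script and block the deploy on failure.
        sys.exit(gate({"no_nulls": True, "unique_ids": True}))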

Novelty

8/10

Tags

data-validation model-evaluation test-coverage llm-powered-assessment automation quality-assurance universal-framework

Technologies

anthropic openai pandas

Claude Models

claude-opus-4.6

Quality Score

Grade: C+ (75.9/100)
Structure: 76
Code Quality: 94
Documentation: 64
Testing: 60
Practices: 63
Security: 90
Dependencies: 60

Strengths

  • Good test coverage (44% test-to-source ratio)
  • Code linting configured (likely ruff)
  • Consistent naming conventions (snake_case)
  • Good security practices: no major issues detected

Weaknesses

  • No LICENSE file: legal ambiguity for contributors
  • No CI/CD configuration: manual testing and deployment
  • Potential hardcoded secrets in 1 file
  • 328 duplicate lines detected: consider DRY refactoring

Recommendations

  • Set up CI/CD (GitHub Actions recommended) to automate testing and deployment
  • Add a LICENSE file (MIT recommended for open source)
  • Move hardcoded secrets to environment variables or a secrets manager (see the sketch below)
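
For the secrets recommendation, a minimal sketch of the fix, assuming the standard OPENAI_API_KEY / ANTHROPIC_API_KEY variable names those SDKs conventionally read (the require_env helper is illustrative):

    import os

    def require_env(name: str) -> str:
        """Fail fast with a clear message if a secret is not configured."""
        value = os.environ.get(name)
        if not value:
            raise RuntimeError(f"Set the {name} environment variable")
        return value

    # Read keys from the environment instead of hardcoding them in source.
    OPENAI_API_KEY = require_env("OPENAI_API_KEY")
    ANTHROPIC_API_KEY = require_env("ANTHROPIC_API_KEY")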

Security & Health

Tech Debt: 5.1h (A)
OWASP: A (100%)
Quality Gate: PASS
Risk: A (1)
License: MIT
Duplication: 4.1%
Generated by Repobility's multi-pass static-analysis pipeline (https://repobility.com)

Languages

python: 94.6%
markdown: 3.3%
toml: 1.1%
shell: 1.1%

Frameworks

pytest
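
Since the prompt requires pytest, a test for the hypothetical evaluate() sketch shown earlier might look like the following (the dmt_eval import path is an assumption, not the repository's real layout):

    from dmt_eval import CallableAdapter, evaluate  # hypothetical module path

    def test_perfect_model_has_full_accuracy():
        # A model that always matches the expected output should score 1.0.
        dataset = [(1, 2), (2, 3), (3, 4)]
        results = evaluate(CallableAdapter(lambda x: x + 1), dataset)
        assert results["accuracy"] == 1.0
        assert results["latency_ms"] >= 0.0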

Concepts (2)

Open methodology · Repobility · https://repobility.com/research/
Category: auto_description · Name: Project Description · Description: Data, Models, Tests — universal validation framework for the age of AI agents. · Confidence: 80%
Category: auto_category · Name: Testing · Description: testing · Confidence: 70%

Quality Timeline

1 quality score recorded.


Embed Badge

Add to your README:

![Quality](https://repos.aljefra.com/badge/83242.svg)