Deriva-ML

C+ 72 completed
Library
cli / python · small
Files: 177
LOC: 59,719
Frameworks: 2
Languages: 5

Pipeline State

completed
Run ID
#303847
Phase
done
Progress
1%
Started
Finished
2026-04-13 01:31:02
LLM tokens
0

Pipeline Metadata

Stage
Cataloged
Decision
proceed
Novelty
77.07
Framework unique
Isolation
Last stage change
2026-05-10 03:34:57
Deduplication group #56108
Member of a group with 3 similar repo(s); canonical #47695
Top concepts (12)
Project Description, Singleton, business_logic, testing, Testing, Factory, Strategy, Testing, File Management, Database, Configuration, Logging
Repobility · code-quality intelligence platform · https://repobility.com

AI Prompt

Create a command-line interface (CLI) Python library called DerivaML. The goal is to simplify creating and executing reproducible machine learning pipelines using a Deriva catalog. The project should utilize pytest for testing and SQLAlchemy for database interactions. Please structure the code to handle configuration using YAML, JSON, and TOML files, and ensure the documentation setup is ready, perhaps using mkdocs.yml.
python cli machine-learning deriva pandas pytest sqlalchemy mlops
Generated by gemma4:latest

Catalog Information

This project simplifies the use of Deriva and Pandas for creating reproducible machine learning pipelines.

Description

Deriva-ML is a collection of utilities designed to streamline the process of building machine learning pipelines using Deriva and Pandas. It aims to provide a simple and efficient way to create reproducible workflows, making it easier for data scientists to focus on model development rather than tedious setup tasks.

Description (Arabic)

This project aims to make it easier to use Deriva and Pandas to build reproducible machine learning systems. It provides a set of tools designed to simplify building those systems, making them easier to use and reducing the time spent on planning.

Novelty

5/10

Tags

machine-learning pipeline-creation reproducibility data-science workflow-management

Technologies

pandas pydantic sqlalchemy

Claude Models

claude-opus-4.6

Quality Score

C+
72.5/100
Structure
80
Code Quality
73
Documentation
80
Testing
85
Practices
52
Security
57
Dependencies
90

Strengths

  • CI/CD pipeline configured (github_actions)
  • Good test coverage (69% test-to-source ratio)
  • Code linting configured (ruff (possible))
  • Consistent naming conventions (snake_case)
  • Properly licensed project

Weaknesses

  • Potential hardcoded secrets in 1 file
  • 2516 duplicate lines detected; consider DRY refactoring
  • 16 'god files' with >500 LOC need decomposition

Recommendations

  • Move hardcoded secrets to environment variables or a secrets manager
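
The recommendation above can be sketched roughly as follows. This is a minimal illustration, not code from the analyzed repository: the helper name `get_secret` and the variable name `DERIVA_ML_TOKEN` are hypothetical, and the report does not say which file or value was flagged.

```python
import os


def get_secret(name, env=None):
    """Return a secret from the environment, failing loudly if it is unset.

    `env` defaults to os.environ; passing a plain dict makes the helper
    easy to exercise in tests without touching the real environment.
    """
    env = os.environ if env is None else env
    value = env.get(name)
    if not value:
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value


# Hypothetical usage: the token is supplied by the shell or CI secrets store,
# never written into the source tree.
# token = get_secret("DERIVA_ML_TOKEN")
```

Failing at startup when the variable is missing tends to be easier to debug than silently falling back to an empty or default credential.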

Security & Health

Tech Debt (A): 8.8h
DORA Rating: Medium
OWASP (100%): A
Quality Gate: PASS
Risk (0): A
License: Apache-2.0
Duplication: 4.6%

Languages

python
63.3%
json
29.8%
markdown
6.4%
yaml
0.3%
toml
0.2%

Frameworks

pytest SQLAlchemy

Symbols

variable: 614
method: 548
function: 163
class: 132
constant: 99
property: 63
protocol: 10

Concepts (12)

Category | Name | Description | Confidence
auto_description | Project Description | Deriva-ML is a python library to simplify the process of creating and executing reproducible machine learning workflows using a deriva catalog. | 80%
design_pattern | Singleton | Found get_instance/instance patterns | 70%
arch_layer | business_logic | Detected business_logic layer | 70%
arch_layer | testing | Detected testing layer | 70%
auto_category | Testing | testing | 70%
design_pattern | Factory | Found factory/create_ naming patterns | 60%
design_pattern | Strategy | Found strategy/policy-named files | 60%
business_logic | Testing | Detected from 52 related files | 50%
business_logic | File Management | Detected from 15 related files | 50%
business_logic | Database | Detected from 33 related files | 50%
business_logic | Configuration | Detected from 15 related files | 50%
business_logic | Logging | Detected from 15 related files | 50%

Quality Timeline

1 quality score recorded.


Embed Badge

Add to your README:

![Quality](https://repos.aljefra.com/badge/27674.svg)