Aumai Datacommons

B+ 87 completed

Data Tool

cli / python · tiny

Files

1,568

LOC

Frameworks

Languages

Pipeline State

completed

Run ID

#304001

Phase

done

Progress

Started

Finished

2026-04-13 01:31:02

LLM tokens

Pipeline Metadata

Stage

Cataloged

Decision

proceed

Novelty

51.56

Framework unique

—

Isolation

—

Last stage change

2026-05-10 03:35:02

Deduplication group #48562

Member of a group with 20 similar repositories; the canonical entry is #28205.

Top concepts (5)

Project Description, testing, Testing, Factory, Testing

🧪 Code Distillation

Sample distilled functions (click for full spec)

compute_sha256

Calculates the SHA-256 cryptographic hash digest of a file provided by its path. It accepts a string representing the file's location and returns the resulting hash as a lowercase hexadecimal string. The function reads the file in manageable binary chunks to efficiently process potentially large fil
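The chunked hashing described above can be sketched as follows. This is a minimal illustration rather than the repository's actual code: the `chunk_size` parameter and its 64 KiB default are assumptions, since the distilled description only says the file is read in manageable binary chunks.

```python
import hashlib


def compute_sha256(path: str, chunk_size: int = 65536) -> str:
    """Return the lowercase hexadecimal SHA-256 digest of the file at `path`.

    `chunk_size` is a hypothetical parameter; the source does not state one.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size binary chunks so a large file is never held
        # in memory all at once; read() returns b"" at end of file.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Feeding the digest incrementally yields the same result as hashing the whole file in one call, so the chunk size affects only memory use and I/O granularity, not the output.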

DatasetVersionManager.list_versions

Retrieves an ordered list of all historical versions associated with a specified dataset. It accepts one string argument, the unique identifier for the dataset. The function returns a list containing DatasetVersion objects, sorted chronologically from the oldest to the newest. This operation reads f
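A rough sketch of the ordering behaviour described above, under stated assumptions: the fields of `DatasetVersion`, the in-memory `_versions` dictionary, and the `register` helper are all hypothetical stand-ins, because the distilled description is truncated before it names the storage it reads from. The project lists pydantic among its technologies; a plain dataclass is used here only to keep the example dependency-free.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class DatasetVersion:
    # Hypothetical fields; the real model is not shown in the source.
    dataset_id: str
    version: str
    created_at: datetime


class DatasetVersionManager:
    def __init__(self) -> None:
        # Assumed in-memory store: dataset id -> list of version records.
        self._versions = {}

    def register(self, record: DatasetVersion) -> None:
        """Hypothetical helper that stores a record for its dataset."""
        self._versions.setdefault(record.dataset_id, []).append(record)

    def list_versions(self, dataset_id: str) -> list:
        """Return the dataset's versions sorted oldest to newest."""
        records = self._versions.get(dataset_id, [])
        return sorted(records, key=lambda record: record.created_at)


manager = DatasetVersionManager()
manager.register(DatasetVersion("demo", "1.1.0", datetime(2026, 2, 1, tzinfo=timezone.utc)))
manager.register(DatasetVersion("demo", "1.0.0", datetime(2026, 1, 1, tzinfo=timezone.utc)))
ordered = [record.version for record in manager.list_versions("demo")]
```

Sorting on the creation timestamp rather than on insertion order keeps the result chronological even when records arrive out of order, as in the usage lines at the end of the block.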

DatasetVersionManager.create_version

Generates a new version record for a specified dataset by automatically incrementing the minor version number from the most recent entry, defaulting to 1.0.0 if no versions exist. It accepts the unique identifier of the dataset and a descriptive string detailing the changes. The function returns the
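The minor-version increment described above might look like the sketch below. Several details are assumptions: the `DatasetVersion` fields, the in-memory store, resetting the patch number to 0 on a minor bump, and returning the new record (the distilled description is cut off before it states the return value).

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class DatasetVersion:
    # Hypothetical fields; the real model is not shown in the source.
    dataset_id: str
    version: str
    description: str
    created_at: datetime


def next_minor(latest: Optional[str]) -> str:
    """Return "1.0.0" when there is no prior version, else bump the minor part."""
    if latest is None:
        return "1.0.0"
    major, minor, _patch = (int(part) for part in latest.split("."))
    # Assumption: the patch component resets to 0 on a minor bump.
    return f"{major}.{minor + 1}.0"


class DatasetVersionManager:
    def __init__(self) -> None:
        # Assumed in-memory store: dataset id -> records in creation order.
        self._versions = {}

    def create_version(self, dataset_id: str, description: str) -> DatasetVersion:
        history = self._versions.setdefault(dataset_id, [])
        latest = history[-1].version if history else None
        record = DatasetVersion(
            dataset_id, next_minor(latest), description, datetime.now(timezone.utc)
        )
        history.append(record)
        return record
```

Calling `create_version` twice on a fresh manager for the same dataset would produce versions 1.0.0 and then 1.1.0 under these assumptions.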

AI Prompt

Create a command-line tool in Python for the AumAI Data Commons. The goal is to provide open datasets specifically for developing artificial agents. The tool should have core functionality, and I'd like it to be structured so that users can easily follow the documentation, which includes sections for getting started and an API reference. Please ensure the project structure supports contributions and includes necessary setup files like a Makefile and pyproject.toml.

python cli dataset ai agent command-line open-source pytest

Generated by gemma4:latest

Catalog Information

The AUMAI Data Commons project provides open datasets for the development of artificial agents.

Description

AUMAI Data Commons is a collection of open datasets designed to support the development and training of artificial agents. The project aims to provide a centralized repository of data that can be used by researchers, developers, and organizations working on agent-based systems. The datasets cover various domains and are intended to facilitate the creation of more advanced and realistic artificial agents.

Description (translated from Arabic)

The AumAI Data Commons project is a collection of open data designed to support the development and training of intelligent agents. The project aims to provide a central data store that can be used by researchers, developers, and organizations working on agent-based systems. The data covers several domains and is intended to enable the creation of more advanced and realistic intelligent agents.

Novelty

5/10

Technologies

click pydantic

Claude Models

claude-opus-4.6

Quality Score

B+

87.1/100

Structure

Code Quality

Documentation

Testing

Practices

Security

100

Dependencies

Strengths

CI/CD pipeline configured (github_actions)
Good test coverage (60% test-to-source ratio)
Code linting configured (ruff (possible))
Consistent naming conventions (snake_case)
Good security practices: no major issues detected
Properly licensed project

Security & Health

4.1h

Tech Debt (D)

Medium

DORA Rating

OWASP (100%)

PASS

Quality Gate

Risk (6)

Apache-2.0

License

0.0%

Duplication

Languages

python: 82.5%
markdown: 8.6%
yaml: 5.0%
toml: 3.9%

Frameworks

pytest

Symbols

variable: 29
method: 10
function: 9
class: 8

Concepts (5)

Category	Name	Description	Confidence
auto_description	Project Description	Open datasets for agent development	80%
arch_layer	testing	Detected testing layer	70%
auto_category	Testing	testing	70%
design_pattern	Factory	Found factory/create_ naming patterns	60%
business_logic	Testing	Detected from 3 related files	50%

Quality Timeline

1 quality score recorded.

Embed Badge

Add to your README:

![Quality](https://repos.aljefra.com/badge/27828.svg)

BinComp Dependency Hardening

2 of this repo's dependencies have been scanned for binary hardening. Grade reflects RELRO / stack canary / FORTIFY / PIE coverage.

N · click 8.3.2 · 0 gadgets · risk 0.0
N · pydantic 2.12.5 · 0 gadgets · risk 0.0
