GraphInstruct

Grade: C (70) · completed
Stage: Testing
unknown / python · small
Files: 57
LOC: 143,281
Frameworks: 0
Languages: 5

Pipeline State

State: completed
Run ID: #345196
Phase: done
Progress: 1%
Started:
Finished: 2026-04-13 01:31:02
LLM tokens: 0

Pipeline Metadata

Stage: Cataloged
Decision: proceed
Novelty: 59.67
Framework unique:
Isolation:
Last stage change: 2026-05-10 03:35:31
Deduplication group: #53898
Member of a group with 4 similar repo(s); this repo is the canonical one.
Top concepts (2): Project Description, Data/ML

AI Prompt

Create a benchmark framework, similar to GraphInstruct, for evaluating how well large language models generate graph-structured data from natural language instructions. The system should support progressive evaluation across 6 instruction levels (L0 to L5), covering everything from basic format compliance to multi-step reasoning. I need to evaluate across 5 dimensions: Structure, Text, Embedding, Instruction Match, and Efficiency. Please structure the code to handle parsing, constraint validation (like checking for trees or bipartite graphs), and a hierarchical scoring mechanism that can generate a Pareto analysis for comparing models.
python benchmark llm graph evaluation nlp data-science machine-learning
Generated by gemma4:latest
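The prompt above calls for constraint validation such as checking whether a generated graph is a tree or bipartite. As a minimal sketch of what such validators might look like (the function names and the `(n, edge_list)` graph representation are illustrative assumptions, not taken from the actual repository):

```python
from collections import deque


def _adjacency(n, edges):
    """Build an undirected adjacency list for nodes 0..n-1."""
    adj = {i: [] for i in range(n)}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    return adj


def is_tree(n, edges):
    """A graph on n nodes is a tree iff it has exactly n-1 edges and is connected."""
    if len(edges) != n - 1:
        return False
    adj = _adjacency(n, edges)
    seen = {0}
    queue = deque([0])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return len(seen) == n


def is_bipartite(n, edges):
    """Greedily 2-color each component via BFS; a color conflict means an odd cycle."""
    adj = _adjacency(n, edges)
    color = {}
    for start in range(n):
        if start in color:
            continue
        color[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in color:
                    color[v] = 1 - color[u]
                    queue.append(v)
                elif color[v] == color[u]:
                    return False
    return True
```

Validators of this shape can be run against parsed model output at each instruction level before any scoring is applied.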

Catalog Information

A benchmark framework that evaluates large language models on progressively generating graphs from natural language instructions.

Description

GraphInstruct provides a structured benchmark for assessing how well large language models can generate graph visualizations from textual instructions. It includes a curated set of progressive tasks that incrementally increase in complexity, allowing researchers to track model performance over stages. The framework integrates visualization libraries to render graphs and offers metrics for accuracy, fidelity, and generation speed. Targeted at NLP and graph generation researchers, it facilitates reproducible comparisons across models and encourages the development of more capable instruction‑driven generation systems.
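The benchmark is also described as producing a Pareto analysis for comparing models across its scoring dimensions. A minimal sketch of such an analysis, assuming each model maps to a tuple of higher-is-better scores (`pareto_front` is a hypothetical helper, not a function from the repository):

```python
def pareto_front(scores):
    """Return the models not dominated on any scoring dimension.

    scores: {model_name: (score_1, ..., score_k)}, higher is better on every axis.
    Model a dominates model b if a >= b on all axes and a > b on at least one.
    """
    def dominates(a, b):
        return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

    return [m for m, s in scores.items()
            if not any(dominates(t, s) for n, t in scores.items() if n != m)]


# Example: one model strong on structure, another on efficiency, a third dominated.
front = pareto_front({
    "model-a": (0.9, 0.2),
    "model-b": (0.5, 0.8),
    "model-c": (0.4, 0.4),
})
```

Here `model-c` is dominated by `model-b`, so only the first two would appear on the front; with the benchmark's five dimensions the tuples would simply be length five.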

Description (translated from Arabic)

GraphInstruct provides a structured benchmark framework for evaluating the ability of large language models to generate graphs from textual instructions. It includes a curated set of progressive tasks that grow steadily in complexity, letting researchers track model performance across stages. The framework integrates visualization libraries to render graphs and provides metrics for accuracy, requirement fidelity, and generation speed. It targets researchers in natural language processing and graph generation, enabling reproducible comparisons across models. It helps drive the development of systems better able to handle complex textual instructions, with a focus on improving the quality of the generated graphs, and allows the evaluation to be extended to diverse application scenarios, strengthening model effectiveness in real-world settings.

Novelty

8/10

Tags

graph-generation instruction-driven llm-evaluation benchmark progressive-tasks visualization performance-metrics

Technologies

huggingface matplotlib numpy plotly pytorch

Claude Models

claude-opus-4.6

Quality Score

C (69.9/100)
Structure: 66
Code Quality: 72
Documentation: 80
Testing: 70
Practices: 52
Security: 84
Dependencies: 60

Strengths

  • Good test coverage (52% test-to-source ratio)
  • Code linting configured (likely ruff)
  • Consistent naming conventions (snake_case)
  • Good security practices: no major issues detected

Weaknesses

  • No LICENSE file: legal ambiguity for contributors
  • No CI/CD configuration: manual testing and deployment
  • 2,609 duplicate lines detected: consider DRY refactoring
  • 5 'god files' with >500 LOC need decomposition

Recommendations

  • Set up CI/CD (GitHub Actions recommended) to automate testing and deployment
  • Add a LICENSE file (MIT recommended for open source)

Security & Health

Tech Debt: 6.6h (A)
OWASP: A (100%)
Quality Gate: PASS
Risk: A (0)
License: Unknown
Duplication: 4.0%
Full Security Report · AI Fix Prompts · SARIF · SBOM

Languages

python: 69.2%
json: 22.5%
markdown: 7.2%
shell: 1.0%
toml: 0.1%

Frameworks

None detected

Concepts (2)

Category         | Name                | Description                                                   | Confidence
auto_description | Project Description | Progressive Instruction-Driven LLM Graph Generation Benchmark | 80%
auto_category    | Data/ML             | data-ml                                                       | 70%

Methodology: Repobility · https://repobility.com/research/state-of-ai-code-2026/

Quality Timeline

1 quality score recorded.

Embed Badge

Add to your README:

![Quality](https://repos.aljefra.com/badge/69261.svg)