Llm4Graphgen Repro Stage0 4 Bundle

C 62 completed
Other
unknown / python · small
Files: 244
LOC: 111,212
Frameworks: 1
Languages: 6

Pipeline State

completed
Run ID
#345198
Phase
done
Progress
1%
Started
Finished
2026-04-13 01:31:02
LLM tokens
0

Pipeline Metadata

Stage
Skipped
Decision
skip_scaffold_dup
Novelty
41.33
Framework unique
Isolation
Last stage change
2026-04-16 18:15:42
Deduplication group #47518
Member of a group with 1 similar repo (canonical: #113957)
Top concepts (2): Project Description, Testing

AI Prompt

Create a Python project structure to reproduce the experimental results from the LLM4GraphGen paper (arXiv:2403.14358). The tool needs to support two experimental paths: one using the OpenAI API (like GPT-4) for local execution, and another for deploying LLaMA-2-13B-Chat on a GPU server. The core functionality should include modules for different prompting strategies, parsers to extract graph structures from LLM output, and specific evaluation metrics for Stage 2 (rule-based), Stage 3 (distribution), and Stage 4 (molecular properties using GIN). Please include setup scripts and examples for running these tests using pytest.
python llm graph-generation openai gpu pytest machine-learning research api
Generated by gemma4:latest

Catalog Information

Reproduce the experimental results of the LLM4GraphGen paper, enabling researchers to validate and extend graph generation using large language models.

Description

This project implements the experimental pipeline described in the LLM4GraphGen paper, providing a faithful reproduction of its graph generation results. It leverages PyTorch, NumPy, pandas, and scikit‑learn to preprocess data, fine‑tune language models, and evaluate generated graphs against ground truth. The code is organized into modular scripts that handle data loading, model training, inference, and metric computation, allowing users to replicate each step of the original study. Researchers can use the framework to verify reported performance, compare alternative models, or extend the methodology to new graph domains. The project emphasizes reproducibility, with clear documentation and reproducible environment specifications.
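As a rough illustration of the parser and Stage 2 rule-based evaluation components named in the AI prompt, the following is a minimal sketch and not the repository's actual code. The function names `parse_edges` and `is_tree`, and the assumption that the model writes edges as `(u, v)` integer pairs, are hypothetical choices made for this example.

```python
import re
from collections import deque

# Assumption: the LLM writes each edge as a pair of integers, e.g. "(0, 1)".
EDGE_RE = re.compile(r"\(\s*(\d+)\s*,\s*(\d+)\s*\)")


def parse_edges(text):
    """Extract undirected edges written as (u, v) pairs from free-form text."""
    edges = set()
    for u, v in EDGE_RE.findall(text):
        a, b = int(u), int(v)
        if a != b:  # drop self-loops
            edges.add((min(a, b), max(a, b)))  # normalize edge direction
    return sorted(edges)


def is_tree(num_nodes, edges):
    """Rule check: a graph on n nodes is a tree iff it has n-1 edges and is connected."""
    if num_nodes <= 0 or len(edges) != num_nodes - 1:
        return False
    adjacency = {i: [] for i in range(num_nodes)}
    for u, v in edges:
        if u >= num_nodes or v >= num_nodes:
            return False  # edge refers to a node outside 0..n-1
        adjacency[u].append(v)
        adjacency[v].append(u)
    # Breadth-first search from node 0 to test connectivity.
    seen = {0}
    queue = deque([0])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return len(seen) == num_nodes


reply = "Sure! Here is a tree with 4 nodes: [(0, 1), (1, 2), (1, 3)]"
edges = parse_edges(reply)
print(edges, is_tree(4, edges))
```

A check of this shape could be wrapped in a pytest test that feeds canned model replies to the parser, which keeps the evaluation logic testable without calling any LLM.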

Description (Arabic)

This project provides a faithful implementation of the experimental steps described in the LLM4GraphGen paper, with a focus on reproducing its graph generation results using large language models. It relies on Python libraries such as PyTorch, NumPy, pandas, and scikit‑learn for data processing, model training, and metric computation. The code is organized as a set of modifiable scripts, so that users can load data, train the model, run predictions, and then evaluate the results against the original data. The framework lets researchers verify the reported performance, compare alternative models, or extend the methodology to new graph applications. The project emphasizes reproducibility, with clear documentation and executable environment specifications. It also provides tools for generating synthetic datasets for use in network analysis or machine learning tasks. The project stands out for its flexibility in handling different datasets and its ability to integrate new models easily.

Novelty

7/10

Tags

graph-generation large-language-models reproducibility synthetic-data machine-learning experimental-design benchmarking

Technologies

numpy pandas pytorch scikit-learn

Claude Models

claude-opus-4.6

Quality Score

Overall: C (62.2/100)
Structure: 65
Code Quality: 72
Documentation: 54
Testing: 50
Practices: 55
Security: 74
Dependencies: 60

Strengths

  • Code linting configured (possibly ruff)
  • Consistent naming conventions (snake_case)

Weaknesses

  • No LICENSE file: legal ambiguity for contributors
  • No CI/CD configuration: manual testing and deployment
  • 239 duplicate lines detected: consider DRY refactoring
  • 3 'god files' with >500 LOC need decomposition

Recommendations

  • Set up CI/CD (GitHub Actions recommended) to automate testing and deployment
  • Add a LICENSE file (MIT recommended for open source)

Security & Health

Tech Debt (A): 6.1h
OWASP (100%): A
Quality Gate: PASS
Risk (0): A
License: Unknown
Duplication: 7.0%

Languages

python: 78.4%
markdown: 13.5%
json: 4.2%
shell: 3.0%
toml: 0.9%
text: 0.0%

Frameworks

pytest

Concepts (2)

Category | Name | Description | Confidence
auto_description | Project Description | This repository reproduces from scratch the paper arXiv:2403.14358 (Exploring the Potential of Large Language Models in Graph Generation). | 80%
auto_category | Testing | testing | 70%

Quality Timeline

1 quality score recorded.

Embed Badge

Add to your README:

![Quality](https://repos.aljefra.com/badge/69263.svg)