Vlm

D 52 completed

Api

containerized / python · tiny

Files

2,810

LOC

Frameworks

Languages

Pipeline State

completed

Run ID

#370846

Phase

done

Progress

Started

Finished

2026-04-13 01:31:02

LLM tokens

Pipeline Metadata

Stage

Skipped

Decision

skip_scaffold_dup

Novelty

38.52

Framework unique

—

Isolation

—

Last stage change

2026-04-16 18:15:42

Deduplication group #47702

Member of a group with 1 similar repo; canonical: #29960

Top concepts (2)

Project Description, Web Backend

Repobility · MCP-ready · https://repobility.com

AI Prompt

Create an end-to-end Vision-Language Model pipeline for understanding temporal video data related to warehouse packaging operations. I need this built using FastAPI for the API, and it should handle video clip prediction via a POST endpoint. The system needs components for data loading, specifically using the OpenPack dataset, and must include scripts for both fine-tuning using QLoRA and evaluating the model using metrics like OCA, tIoU, and AA@1. Please structure the deployment using `docker-compose.yml` and provide the necessary Python scripts for the data pipeline and evaluation.
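Of the metrics named in the prompt, tIoU (temporal intersection over union) has a widely used definition: the overlap between a predicted time interval and the ground-truth interval, divided by their combined extent. A minimal sketch follows, assuming intervals are `(start, end)` pairs in seconds. The function name `temporal_iou` is hypothetical and is not taken from the repository's evaluation script, which may compute the metric differently.

```python
def temporal_iou(pred, truth):
    """Temporal IoU between two (start, end) intervals, e.g. in seconds."""
    p_start, p_end = pred
    t_start, t_end = truth
    if p_end < p_start or t_end < t_start:
        raise ValueError("interval end must not precede its start")
    # Overlap is zero when the intervals are disjoint.
    intersection = max(0.0, min(p_end, t_end) - max(p_start, t_start))
    # Union = sum of both lengths minus the part counted twice.
    union = (p_end - p_start) + (t_end - t_start) - intersection
    return intersection / union if union > 0 else 0.0


if __name__ == "__main__":
    # Predicted 2s-6s against ground truth 4s-8s: 2s overlap over a 6s union.
    print(temporal_iou((2.0, 6.0), (4.0, 8.0)))
```

A score of 1.0 means the predicted boundaries match the labeled ones exactly; disjoint intervals score 0.0.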

python fastapi vlm video-analysis machine-learning docker openpack nlp computer-vision

Generated by gemma4:latest

Catalog Information

A vision‑language API that classifies and predicts warehouse packaging operations from video clips.

Description

The service exposes a FastAPI endpoint that accepts short video clips of warehouse packaging operations and returns structured predictions about the operation type, its temporal boundaries, and the anticipated next step. It leverages a fine‑tuned Qwen2.5‑VL‑2B model, trained with QLoRA on the OpenPack dataset, to understand both visual content and textual labels. The pipeline includes motion‑adaptive frame sampling to capture key moments around operation transitions, improving temporal precision. Target users are logistics engineers and warehouse automation teams seeking real‑time analytics and predictive insights. The system addresses the need for accurate, low‑latency operation recognition in industrial video streams, reducing manual monitoring effort.
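The motion-adaptive frame sampling mentioned above can be illustrated with a small sketch. This is one plausible approach, not the repository's confirmed implementation: it assumes a per-frame motion score is already available (for example, mean absolute pixel difference to the previous frame) and picks frames at equal steps of cumulative motion, so that busy stretches such as operation transitions receive more samples. The function name `motion_adaptive_indices` and the `floor` parameter are hypothetical.

```python
from bisect import bisect_left
from itertools import accumulate


def motion_adaptive_indices(motion, num_samples, floor=1e-3):
    """Return up to num_samples frame indices, denser where motion is high.

    motion[i] is a non-negative motion score for frame i (assumed input).
    """
    if num_samples <= 0 or not motion:
        return []
    # A small floor gives static stretches a non-zero share of the budget.
    weights = [max(m, 0.0) + floor for m in motion]
    cumulative = list(accumulate(weights))
    total = cumulative[-1]
    # Midpoints of num_samples equal slices of the cumulative motion.
    targets = [(k + 0.5) * total / num_samples for k in range(num_samples)]
    picked = (bisect_left(cumulative, t) for t in targets)
    # Several targets can land on the same frame, so duplicates are merged.
    return sorted({min(i, len(motion) - 1) for i in picked})


if __name__ == "__main__":
    scores = [0, 0, 0, 0, 5, 5, 0, 0, 0, 0]
    print(motion_adaptive_indices(scores, num_samples=4))
```

Because duplicate picks are merged, the result can contain fewer than `num_samples` indices when motion is concentrated in a few frames; a caller that needs a fixed frame count would have to pad the selection, for instance with uniformly spaced frames.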

Description (translated from Arabic)

This system provides a FastAPI endpoint that receives short video clips showing warehouse packaging operations and returns structured predictions of the operation type, its temporal boundaries, and the expected next step. It relies on a Qwen2.5‑VL‑2B model fine-tuned with QLoRA on the OpenPack dataset to understand visual content and textual labels together. The pipeline includes motion-based frame selection to highlight key moments around operation transitions, which improves temporal precision. It targets logistics engineers and warehouse automation teams who need real-time analytics and predictive insights. The system addresses the need for accurate, low-latency recognition of operations in industrial video streams, reducing manual monitoring effort.

Novelty

7/10

Technologies

fastapi huggingface numpy pandas pytorch scikit-learn scipy uvicorn

Claude Models

claude-opus-4.6

Quality Score

52.4/100

Structure

Code Quality

Documentation

Testing

Practices

Security

Dependencies

Strengths

Consistent naming conventions (snake_case)
Good security practices: no major issues detected
Containerized deployment (Docker)

Weaknesses

No LICENSE file: legal ambiguity for contributors
No tests found: high risk of regressions
No CI/CD configuration: manual testing and deployment
165 duplicate lines detected: consider DRY refactoring
1 'god file' with >500 LOC needs decomposition

Recommendations

Add a test suite: start with critical path integration tests
Set up CI/CD (GitHub Actions recommended) to automate testing and deployment
Add a linter configuration to enforce code style consistency
Add a LICENSE file (MIT recommended for open source)

Security & Health

4.1h

Tech Debt (C)

OWASP (100%)

PASS

Quality Gate

Risk (4)

Provenance: Repobility (https://repobility.com) — every score reproducible from /scan/

Unknown

License

7.4%

Duplication

Languages

python

67.3%

json

22.9%

markdown

7.6%

yaml

1.3%

text

0.9%

Frameworks

FastAPI

Concepts (2)

Category	Name	Description	Confidence
auto_description	Project Description	End-to-end Vision-Language Model pipeline for temporal video understanding in warehouse packaging operations, built on Qwen2.5-VL-2B with QLoRA fine-tuning on the OpenPack dataset.	80%
auto_category	Web Backend	web-backend	70%

Quality Timeline

1 quality score recorded.

Embed Badge

Add to your README:

![Quality](https://repos.aljefra.com/badge/95053.svg)
