Video Object Tracker

F 49 completed

Cli Tool

unknown / python · tiny

Files

830

LOC

Frameworks

Languages

Pipeline State

completed

Run ID

#355372

Phase

done

Progress

Started

Finished

2026-04-13 01:31:02

LLM tokens

Pipeline Metadata

Stage

Skipped

Decision

skip_scaffold_dup

Novelty

24.10

Framework unique

—

Isolation

—

Last stage change

2026-04-16 18:15:42

Deduplication group #47591

Member of a group with 1 similar repo; canonical #65621.

Top concepts (1)

Data/ML

Repobility · MCP-ready · https://repobility.com

AI Prompt

Create a command-line tool in Python for video object tracking. The tool should process a video file and use a multi-model pipeline involving FastVLM, Florence-2, and DINOv2 to detect, localize, and track objects across frames. I need to be able to configure VLM questions like object presence, hand use, and grasp type within `prompts.py`. The script should handle frame prefetching, write annotated frames, and optionally save a JSON log of detected segments, accepting arguments for input video, duration, and FPS.

python cli video-processing object-tracking computer-vision multimodal machine-learning

Generated by gemma4:latest
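
The prompt asks for arguments covering the input video, duration, and FPS, plus an optional JSON log. A minimal sketch of such an interface with `argparse` follows; the flag names (`--duration`, `--fps`, `--json-log`) and the function name `build_parser` are hypothetical and not taken from the repository.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the options the prompt mentions (flag names are assumed)."""
    parser = argparse.ArgumentParser(
        description="Track objects across the frames of a video file."
    )
    parser.add_argument("video", help="path to the input video file")
    parser.add_argument(
        "--duration", type=float, default=None,
        help="seconds of video to process (default: the whole file)",
    )
    parser.add_argument(
        "--fps", type=float, default=None,
        help="frames per second to sample (default: the native rate)",
    )
    parser.add_argument(
        "--json-log", default=None,
        help="optional path for a JSON log of detected segments",
    )
    return parser


# Parse an explicit argument list instead of sys.argv for illustration.
args = build_parser().parse_args(["clip.mp4", "--duration", "10", "--fps", "5"])
```

Leaving `--json-log` unset keeps the log optional, which matches the prompt's wording.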

Catalog Information

A command‑line tool that detects, localizes, and tracks objects across video frames using a multi‑model vision‑language pipeline.

Description

This tool processes each video frame through a three-stage pipeline: a fast vision-language model (FastVLM) first checks whether any object is present, an open-vocabulary detector (Florence-2) then returns bounding boxes, and a visual-embedding model (DINOv2) extracts crop embeddings for similarity-based tracking. Frames with detections are grouped into segments by comparing their embeddings against a running reference, and short noise segments are merged automatically. The pipeline also asks additional VLM questions (hand use, grasp type, adult presence) to provide richer semantic annotations. Frame prefetching, background JPEG writing, and targeted re-rendering enable near real-time performance on a CUDA GPU. The output is an annotated video with segment IDs, reference thumbnails, and VLM answers, plus an optional JSON log of segments. It is aimed at researchers and developers who need multi-modal object tracking in video data.
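
The segment grouping described above can be sketched in plain Python. This is an illustration under stated assumptions, not the repository's code: the similarity threshold, the exponential-moving-average update of the running reference, the minimum segment length, and the rule that a short segment is absorbed by its neighbor are all assumed, and the names `group_segments` and `merge_short` are hypothetical.

```python
import math
from dataclasses import dataclass


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


@dataclass
class Segment:
    start: int        # index of the first frame in the segment
    end: int          # index of the last frame (inclusive)
    reference: list   # running reference embedding


def group_segments(embeddings, threshold=0.8, momentum=0.9):
    """Extend the current segment while a frame matches its reference."""
    segments = []
    for i, emb in enumerate(embeddings):
        if segments and cosine(segments[-1].reference, emb) >= threshold:
            seg = segments[-1]
            seg.end = i
            # Assumed update rule: exponential moving average of the reference.
            seg.reference = [
                momentum * r + (1 - momentum) * e
                for r, e in zip(seg.reference, emb)
            ]
        else:
            segments.append(Segment(i, i, list(emb)))
    return segments


def merge_short(segments, min_len=3):
    """Absorb segments shorter than min_len frames into a neighbor."""
    merged = []
    for seg in segments:
        if merged and seg.end - seg.start + 1 < min_len:
            merged[-1].end = seg.end  # fold into the previous segment
        else:
            merged.append(Segment(seg.start, seg.end, seg.reference))
    # A short leading segment has no predecessor, so fold it into the next one.
    if len(merged) > 1 and merged[0].end - merged[0].start + 1 < min_len:
        merged[1].start = merged[0].start
        merged.pop(0)
    return merged


# Toy embeddings: four frames of one object, one outlier frame, four more frames.
embs = [[1.0, 0.0]] * 4 + [[0.0, 1.0]] + [[1.0, 0.0]] * 4
spans = [(s.start, s.end) for s in merge_short(group_segments(embs))]
```

In this toy run the single outlier frame opens its own segment and is then folded into the preceding one. The two remaining segments are not re-joined, so a real implementation would likely need a further pass if that matters.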

Description (translated from Arabic)

This project provides an integrated system for tracking objects in video clips through a chain of multiple models. Each frame starts with a quick check using the FastVLM model to determine whether an object is present, then the object's location is extracted precisely using the Florence-2 model, which can recognize any class. Next, the visual representation of the detected objects is extracted by DINOv2 and compared against a running reference to decide whether the object continues in the scene or marks the start of a new segment. Frames carrying the same object are grouped into "segments", and short segments considered noise are merged automatically with their neighbors. The system also adds further VLM answers about hand use, grasp type, and the presence of an adult's hand, which allows a more detailed analysis of human behavior. Performance is improved by prefetching frames in the background, writing JPEGs in the background, and redrawing frames only when needed, which gives near real-time processing while preserving video quality.
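
Background frame prefetching, one of the optimizations mentioned in the descriptions, is commonly built from a reader thread and a bounded queue. The sketch below shows that general pattern for any iterable of frames; the name `prefetch` and the buffer size are assumptions, and nothing here is taken from the repository.

```python
import queue
import threading

_SENTINEL = object()  # marks the end of the frame stream


def prefetch(frames, buffer_size=8):
    """Yield items from `frames` while a background thread reads ahead."""
    buf = queue.Queue(maxsize=buffer_size)

    def reader():
        try:
            for frame in frames:
                buf.put(frame)  # blocks while the buffer is full
        finally:
            buf.put(_SENTINEL)  # always signal the end, even after an error

    # Daemon thread: it will not keep the process alive if the consumer stops early.
    threading.Thread(target=reader, daemon=True).start()
    while True:
        item = buf.get()
        if item is _SENTINEL:
            return
        yield item


# Stand-in for decoded frames; a real source would be a video decoder.
collected = list(prefetch(range(20), buffer_size=4))
```

The bounded queue caps memory use, and the consumer (for example, model inference) only waits when the reader falls behind.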

Novelty

8/10

Technologies

huggingface numpy pytorch

Claude Models

claude-opus-4.6

Quality Score

49.0/100

Structure

Code Quality

Documentation

Testing

Practices

Security

Dependencies

Strengths

Consistent naming conventions (snake_case)
Good security practices: no major issues detected

Weaknesses

No LICENSE file: legal ambiguity for contributors
No tests found: high risk of regressions
No CI/CD configuration: manual testing and deployment

Recommendations

Add a test suite: start with critical path integration tests
Set up CI/CD (GitHub Actions recommended) to automate testing and deployment
Add a linter configuration to enforce code style consistency
Add a LICENSE file (MIT recommended for open source)

Security & Health

5.6h

Tech Debt (E)

OWASP (100%)

FAIL

Quality Gate

Risk (22)

Unknown

License

8.8%

Duplication

Languages

python	92.9%
markdown	6.2%
text	0.9%

Frameworks

None detected

Concepts (1)

Category	Name	Description	Confidence
auto_category	Data/ML	data-ml	60%

Quality Timeline

1 quality score recorded.

Embed Badge

Add to your README:

![Quality](https://repos.aljefra.com/badge/79496.svg)
