Tg Harvest

B 80 completed
Cli Tool
cli / python · small
134 files · 10,632 LOC · 1 framework · 5 languages

Pipeline State

completed
Run ID
#352636
Phase
done
Progress
1%
Started
Finished
2026-04-13 01:31:02
LLM tokens
0

Pipeline Metadata

Stage
Skipped
Decision
skip_scaffold_dup
Novelty
44.33
Framework unique
Isolation
Last stage change
2026-04-16 18:15:42
Deduplication group #47626
Member of a group with 2 similar repo(s); canonical: #93576
Top concepts (2)
Project Description, Testing
Generated by Repobility's multi-pass static-analysis pipeline (https://repobility.com)

AI Prompt

Create a command-line tool in Python that harvests data from Telegram. I need it to connect via the MTProto API and be able to extract messages, media metadata, reactions, views, and forwards from various sources like channels, groups, bots, and private chats, even restricted ones. The tool should support running commands like listing channels, parsing a public channel by username, parsing a bot conversation, or parsing a private chat using a numeric ID. It should also include a search function to query the parsed data and ideally offer a web UI interface as well.
python cli telegram mtproto data-harvesting osint automation api
Generated by gemma4:latest
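
The commands named in the prompt (listing channels, parsing a public channel by username, parsing a bot conversation, parsing a private chat by numeric ID, and searching) could be laid out roughly as below. This is only a sketch: the subcommand names, argument names, and the `normalize_username` helper are hypothetical, and it uses the standard-library `argparse` to stay dependency-free, whereas the catalog lists `click` as the project's CLI library.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser whose (hypothetical) subcommands mirror the prompt."""
    parser = argparse.ArgumentParser(prog="tg-harvest")
    sub = parser.add_subparsers(dest="command", required=True)

    # List the channels visible to the logged-in account.
    sub.add_parser("list-channels")

    # Parse a public channel identified by its username.
    channel = sub.add_parser("parse-channel")
    channel.add_argument("username")

    # Parse a conversation with a bot, also identified by username.
    bot = sub.add_parser("parse-bot")
    bot.add_argument("username")

    # Parse a private chat identified by a numeric ID.
    chat = sub.add_parser("parse-chat")
    chat.add_argument("chat_id", type=int)

    # Query data that has already been parsed.
    search = sub.add_parser("search")
    search.add_argument("query")
    return parser


def normalize_username(raw: str) -> str:
    """Strip a leading '@' so '@name' and 'name' are treated alike."""
    return raw.lstrip("@")
```

Calling `build_parser().parse_args(["parse-chat", "123456789"])` gives a namespace whose `command` is `"parse-chat"` and whose `chat_id` is the integer `123456789`; the real tool's command names may differ.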

Catalog Information

A Python tool that harvests Telegram data from channels, groups, bots, and private chats via the MTProto API.

Description

tg-harvest is a command‑line utility written in Python that connects to Telegram’s MTProto API to retrieve message histories, media, and metadata from channels, groups, bots, and private chats. It offers a clean, typed interface using pydantic models, and presents results in a readable format with rich console output. Users can optionally launch a lightweight Streamlit dashboard to visualize data distributions with Plotly charts. The tool is designed for researchers, analysts, and developers who need structured Telegram data for compliance, sentiment analysis, or content curation. It handles pagination, rate limits, and authentication securely, making data extraction reliable and repeatable.
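
The description says the tool handles pagination and rate limits but does not show how. One common pattern, sketched here under assumptions, is to page backwards by message ID and pause whenever the server signals a wait (Telegram reports these as flood-wait errors). The `RateLimited` exception, the `fetch_page` callback, and the `iter_messages` function are all hypothetical stand-ins, not the project's actual API.

```python
import time
from typing import Callable, Iterator, List, Optional


class RateLimited(Exception):
    """Hypothetical error carrying the number of seconds the server asks us to wait."""

    def __init__(self, wait_seconds: float) -> None:
        super().__init__(f"rate limited, retry after {wait_seconds}s")
        self.wait_seconds = wait_seconds


def iter_messages(
    fetch_page: Callable[[Optional[int], int], List[dict]],
    page_size: int = 100,
    max_retries: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[dict]:
    """Yield messages newest-first, paging by the lowest message ID seen so far.

    fetch_page(before_id, limit) stands in for the real API call: it returns up
    to `limit` messages with IDs below `before_id`, or the newest ones if None.
    """
    before_id: Optional[int] = None
    while True:
        # Retry the same page when the server asks us to slow down.
        for attempt in range(max_retries + 1):
            try:
                page = fetch_page(before_id, page_size)
                break
            except RateLimited as exc:
                if attempt == max_retries:
                    raise
                sleep(exc.wait_seconds)
        if not page:
            return  # An empty page means the history is exhausted.
        yield from page
        before_id = min(message["id"] for message in page)
```

Passing `sleep` as a parameter keeps the generator testable without real delays; a production client would more likely rely on the retry behavior of whichever MTProto library the project uses.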

Description (translated from Arabic)

tg-harvest is a command-line tool written in Python that connects to Telegram's MTProto interface to retrieve message histories, media, and metadata from channels, groups, bots, and private chats. It offers an interface backed by validated models via the pydantic library and displays results in an easy-to-read format in the terminal using the rich library. It can also open a lightweight Streamlit dashboard that shows interactive Plotly charts, enabling analysis of the temporal and topical distributions of the data. The tool is designed for researchers, analysts, and software developers who need structured Telegram data for compliance, sentiment analysis, or content aggregation. It manages pagination, rate limits, and authentication securely, making data extraction reliable and repeatable.

Novelty

6/10

Tags

telegram-data-extraction message-harvesting chat-analytics content-scraping data-collection

Technologies

click plotly pydantic rich streamlit

Claude Models

claude-opus-4.6

Quality Score

B (80.0/100)
Structure: 91
Code Quality: 87
Documentation: 52
Testing: 85
Practices: 69
Security: 90
Dependencies: 60

Strengths

  • CI/CD pipeline configured (github_actions)
  • Good test coverage (97% test-to-source ratio)
  • Code linting configured (possibly ruff)
  • Consistent naming conventions (snake_case)
  • Good security practices: no major issues detected
  • Properly licensed project

Weaknesses

  • Potential hardcoded secrets in 1 file
  • 232 duplicate lines detected; consider DRY refactoring
  • 1 'god file' with >500 LOC needs decomposition

Recommendations

  • Move hardcoded secrets to environment variables or a secrets manager
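
A minimal sketch of the recommendation above, assuming the flagged secrets are the Telegram API credentials (an `api_id` and `api_hash` pair, which MTProto clients need). The report does not say which file or values were flagged, so the environment variable names `TG_API_ID` and `TG_API_HASH` and the `load_credentials` helper are hypothetical.

```python
import os
from typing import Mapping, NamedTuple


class TelegramCredentials(NamedTuple):
    api_id: int
    api_hash: str


def load_credentials(env: Mapping[str, str] = os.environ) -> TelegramCredentials:
    """Read credentials from the environment; the variable names are hypothetical."""
    required = ("TG_API_ID", "TG_API_HASH")
    # Report every missing or empty variable at once instead of failing on the first.
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return TelegramCredentials(api_id=int(env["TG_API_ID"]), api_hash=env["TG_API_HASH"])
```

Failing fast with a clear message when a variable is absent avoids falling back to a value committed in source, which is the pattern the scanner is warning about.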

Security & Health

Tech Debt (B): 8.8h
OWASP (100%): A
Quality Gate: PASS
Risk (2): A
License: MIT
Duplication: 2.3%

Languages

python: 85.7%
markdown: 7.9%
json: 5.4%
toml: 0.6%
yaml: 0.4%

Frameworks

pytest

Concepts (2)

Category | Name | Description | Confidence
auto_description | Project Description | ![CI](https://github.com/klivak/tg-harvest/actions/workflows/ci.yml) | 80%
auto_category | Testing | testing | 70%

Quality Timeline

1 quality score recorded.

Embed Badge

Add to your README:

![Quality](https://repos.aljefra.com/badge/76748.svg)