Tgprediction

C 65 completed
AI/ML
unknown / python · small
Files: 100
LOC: 26,992
Frameworks: 0
Languages: 4

Pipeline State

Status: completed
Run ID: #345197
Phase: done
Progress: 1%
Started:
Finished: 2026-04-13 01:31:02
LLM tokens: 0

Pipeline Metadata

Stage
Skipped
Decision
skip_scaffold_dup
Novelty
42.67
Framework unique
Isolation
Last stage change
2026-04-16 18:15:42
Deduplication group #47727
Member of a group with 1 similar repo(s); canonical repo: #18123
Top concepts (2)
Project Description, Data/ML
Repobility (the analyzer behind this table) · https://repobility.com

AI Prompt

Create a machine learning project in Python designed to predict the glass transition temperature ($T_g$) of polymers. The system needs to handle end-to-end prediction, accepting DNA or RNA sequences as input and outputting the predicted $T_g$ value, optionally in JSON format. It should support both interactive and command-line usage, allowing users to specify the sequence and its type (DNA or RNA). Additionally, include functionality for feature engineering, starting from sequence-to-SMILES conversion, and for running comparative experiments such as transfer learning and SHAP analysis.
python machine-learning polymer-science cheminformatics predictive-modeling
Generated by gemma4:latest
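
The command-line usage described in the prompt could look roughly like the sketch below. It is not the repository's actual interface: the flag names (`--sequence`, `--type`, `--json`), the `predicted_tg` key, and the `placeholder_predictor` hook are all hypothetical, and no model is attached, so the predicted value is `None`.

```python
import argparse
import json

# Allowed alphabets per sequence type (standard unambiguous bases only).
VALID_BASES = {"dna": set("ACGT"), "rna": set("ACGU")}


def validate_sequence(sequence, seq_type):
    """Upper-case the sequence and reject characters outside the alphabet."""
    seq = sequence.strip().upper()
    if not seq:
        raise ValueError("sequence is empty")
    invalid = sorted(set(seq) - VALID_BASES[seq_type])
    if invalid:
        raise ValueError(f"invalid {seq_type.upper()} characters: {''.join(invalid)}")
    return seq


def placeholder_predictor(sequence, seq_type):
    """Hypothetical hook: a real model would return a Tg estimate here."""
    return None  # no model is attached in this sketch


def build_parser():
    parser = argparse.ArgumentParser(
        description="Predict polymer Tg from a nucleic acid sequence"
    )
    parser.add_argument("--sequence", required=True, help="DNA or RNA sequence")
    parser.add_argument("--type", dest="seq_type", choices=sorted(VALID_BASES), default="dna")
    parser.add_argument("--json", dest="as_json", action="store_true", help="emit JSON")
    return parser


def main(argv, predictor=placeholder_predictor):
    """Parse arguments, validate the sequence, and format the result."""
    args = build_parser().parse_args(argv)
    seq = validate_sequence(args.sequence, args.seq_type)
    result = {
        "sequence": seq,
        "type": args.seq_type,
        "predicted_tg": predictor(seq, args.seq_type),
    }
    if args.as_json:
        return json.dumps(result)
    return f"{result['type'].upper()} {seq}: predicted Tg = {result['predicted_tg']}"


print(main(["--sequence", "acgt", "--type", "dna", "--json"]))
```

Passing the predictor as a parameter keeps argument parsing and validation testable without a trained model.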

Catalog Information

This project uses machine learning models to predict the glass transition temperature of polymers, including nucleic acid-based polymers. It is aimed at researchers and materials scientists.

Description

The system integrates a large curated dataset of over 22,000 polymer glass transition temperatures and applies multi‑level feature engineering to extract 34 physicochemical descriptors from SMILES representations. A gradient‑boosted regression model, trained on both high‑quality baseline data and a transfer‑learning bridge set of hydrogen‑bonding polymers, delivers end‑to‑end predictions directly from DNA or RNA sequences. Users can interactively input sequences or run batch predictions via a command‑line interface, receiving Tg estimates along with confidence scores. The tool also provides SHAP‑based feature importance visualizations and supports exporting results in JSON for downstream workflows. Designed for polymer scientists and materials researchers, it accelerates the discovery of high‑performance polymers and aids in selecting materials for temperature‑critical applications.
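
As a rough illustration of the modeling step described above, the sketch below fits a scikit-learn `GradientBoostingRegressor` on a 34-column descriptor matrix and ranks the descriptors. Everything here is an assumption: the data are random stand-ins rather than the curated Tg dataset, the target formula is invented for the demo, and permutation importance is swapped in for SHAP because `shap` is not among the listed technologies.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

N_DESCRIPTORS = 34  # descriptor count stated in the description


def make_synthetic_data(n_samples=300, seed=0):
    """Random stand-in for the descriptor matrix and Tg targets (not real data)."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n_samples, N_DESCRIPTORS))
    # Invented relationship: target driven mostly by descriptor 0, plus noise.
    y = 350.0 + 40.0 * X[:, 0] + 10.0 * X[:, 1] + rng.normal(scale=2.0, size=n_samples)
    return X, y


def train_and_rank(X, y, seed=0):
    """Fit a gradient-boosted regressor and rank descriptors by permutation importance."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=seed
    )
    model = GradientBoostingRegressor(n_estimators=100, random_state=seed)
    model.fit(X_train, y_train)
    r2 = model.score(X_test, y_test)  # coefficient of determination on held-out rows
    result = permutation_importance(
        model, X_test, y_test, n_repeats=5, random_state=seed
    )
    ranking = np.argsort(result.importances_mean)[::-1]  # most important first
    return model, r2, ranking


X, y = make_synthetic_data()
model, r2, ranking = train_and_rank(X, y)
print(f"held-out R^2: {r2:.3f}")
print("top descriptors by permutation importance:", ranking[:3].tolist())
```

In a real pipeline the descriptor matrix would come from the SMILES-based feature engineering step rather than a random generator.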

Novelty

7/10

Tags

polymer-science glass-transition-temperature machine-learning nucleic-acid-prediction feature-engineering data-integration end-to-end-modeling

Technologies

matplotlib numpy pandas scikit-learn

Claude Models

claude-opus-4.6

Quality Score

Overall: C (65.3/100)
Structure: 63
Code Quality: 63
Documentation: 80
Testing: 60
Practices: 52
Security: 80
Dependencies: 60

Strengths

  • Good test presence (31% test-to-source ratio)
  • Consistent naming conventions (snake_case)

Weaknesses

  • No LICENSE file: legal ambiguity for contributors
  • No CI/CD configuration: manual testing and deployment
  • 2046 duplicate lines detected: consider DRY refactoring
  • 7 'god files' with >500 LOC need decomposition
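
To find candidates for decomposition locally, a short script along these lines may help. It is only a sketch: how the analyzer counts LOC is not stated, so counting non-blank lines is an assumption, and `find_large_files` is a hypothetical helper name.

```python
from pathlib import Path


def count_loc(path):
    """Count non-blank lines in a file (assumed LOC definition)."""
    with open(path, encoding="utf-8", errors="replace") as fh:
        return sum(1 for line in fh if line.strip())


def find_large_files(root, threshold=500, pattern="*.py"):
    """Return (path, loc) pairs above the threshold, largest first."""
    sized = [(p, count_loc(p)) for p in Path(root).rglob(pattern) if p.is_file()]
    large = [(p, n) for p, n in sized if n > threshold]
    return sorted(large, key=lambda item: item[1], reverse=True)


for path, loc in find_large_files("."):
    print(f"{loc:6d}  {path}")
```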

Recommendations

  • Set up CI/CD (GitHub Actions recommended) to automate testing and deployment
  • Add a linter configuration to enforce code style consistency
  • Add a LICENSE file (MIT recommended for open source)

Security & Health

Tech Debt (A): 7.1h
OWASP (100%): A
Quality Gate: PASS
Risk (1): A
License: Unknown
Duplication: 6.9%

Languages

python: 60.2%
markdown: 38.2%
json: 1.3%
text: 0.3%

Frameworks

None detected

Concepts (2)

Category | Name | Description | Confidence
auto_description | Project Description | Tongji University SITP (Student Innovation Training Program) project: machine-learning-based prediction of polymer glass transition temperature (Tg), with end-to-end Tg prediction from nucleic acid sequences. | 80%
auto_category | Data/ML | data-ml | 70%

Quality Timeline

1 quality score recorded.

Embed Badge

Add to your README:

![Quality](https://repos.aljefra.com/badge/69262.svg)