Vep Annotator

C 67 completed
Other
unknown / cpp · tiny
Files: 43
LOC: 20,794
Frameworks: 0
Languages: 4

Pipeline State

completed
Run ID: #1541613
Phase: done
Progress: 0%
Started: 2026-04-16 20:53:03
Finished: 2026-04-16 20:53:03
LLM tokens: 0

Pipeline Metadata

Stage: Cataloged
Decision: proceed
Novelty: 44.00
Framework unique
Isolation
Last stage change: 2026-05-10 03:34:51
Deduplication group #47258
Member of a group with 648 similar repos; canonical #1577108.

AI Prompt

Create a high-performance C++ tool that replicates the core functionality of Ensembl's Variant Effect Predictor (VEP). I need it to take variant data and perform comprehensive annotation. Key features must include consequence prediction using Sequence Ontology terms, calculating impact classifications (HIGH, MODERATE, LOW, MODIFIER), and generating HGVS notations (HGVSc, HGVSp, HGVSg). The tool should support multiple output formats: TSV (default), JSON, and VCF. Additionally, please incorporate support for pathogenicity predictions using scores like SIFT, CADD, and gnomAD, as well as structural variant annotation and handling of regulatory features like promoters and enhancers.
cpp bioinformatics variant-calling annotation cpp-plus-plus vcf json high-performance
Generated by gemma4:latest
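
The impact classification requested in the prompt can be illustrated with a small lookup from Sequence Ontology consequence terms to the four tiers. This is a minimal sketch, not code from the repository: the names `Impact`, `impact_for`, `impact_name`, and `most_severe` are hypothetical, only a handful of SO terms are listed (with the tiers Ensembl VEP assigns them), and falling back to MODIFIER for unknown terms is an assumption.

```cpp
#include <algorithm>
#include <string>
#include <string_view>
#include <unordered_map>
#include <vector>

// Impact tiers, ordered from least to most severe so they can be compared.
enum class Impact { Modifier = 0, Low = 1, Moderate = 2, High = 3 };

// Hypothetical helper: map one SO consequence term to its impact tier.
// The table is deliberately partial; unknown terms fall back to MODIFIER
// (an assumption made for this sketch).
inline Impact impact_for(std::string_view so_term) {
    static const std::unordered_map<std::string_view, Impact> table = {
        {"stop_gained", Impact::High},
        {"frameshift_variant", Impact::High},
        {"splice_donor_variant", Impact::High},
        {"splice_acceptor_variant", Impact::High},
        {"missense_variant", Impact::Moderate},
        {"inframe_insertion", Impact::Moderate},
        {"inframe_deletion", Impact::Moderate},
        {"synonymous_variant", Impact::Low},
        {"splice_region_variant", Impact::Low},
        {"intron_variant", Impact::Modifier},
        {"upstream_gene_variant", Impact::Modifier},
    };
    const auto it = table.find(so_term);
    return it == table.end() ? Impact::Modifier : it->second;
}

// Render a tier using the labels named in the prompt.
inline const char* impact_name(Impact impact) {
    switch (impact) {
        case Impact::High:     return "HIGH";
        case Impact::Moderate: return "MODERATE";
        case Impact::Low:      return "LOW";
        case Impact::Modifier: return "MODIFIER";
    }
    return "MODIFIER";
}

// A variant can carry several consequence terms for one transcript;
// report the most severe tier among them (MODIFIER if the list is empty).
inline Impact most_severe(const std::vector<std::string>& so_terms) {
    Impact worst = Impact::Modifier;
    for (const auto& term : so_terms) {
        worst = std::max(worst, impact_for(term));
    }
    return worst;
}
```

Ordering the enumerators by severity keeps the "most severe consequence" step a plain `std::max`, which is one reasonable design choice among several; for instance, `impact_name(most_severe({"intron_variant", "missense_variant"}))` yields `MODERATE` under this table.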

Quality Score

C · 66.6/100
Structure: 64
Code Quality: 50
Documentation: 70
Testing: 60
Practices: 78
Security: 100
Dependencies: 80

Strengths

  • Good test coverage (36% test-to-source ratio)
  • Consistent naming conventions (snake_case)
  • Good security practices — no major issues detected
  • Properly licensed project

Weaknesses

  • No CI/CD configuration — manual testing and deployment
  • 1714 duplicate lines detected — consider DRY refactoring
  • 6 'god files' with >500 LOC need decomposition

Recommendations

  • Set up CI/CD (GitHub Actions recommended) to automate testing and deployment
  • Add a linter configuration to enforce code style consistency

Languages

cpp: 96.2%
markdown: 2.2%
text: 1.0%
shell: 0.6%

Frameworks

None detected

Symbols

method: 379
function: 118
class: 41
struct: 41
module: 25
enum: 12
type_alias: 2
macro: 1

Quality Timeline

1 quality score recorded.

Provenance: Repobility (https://repobility.com) — every score reproducible from /scan/

BinComp Dependency Hardening

1 of this repo's dependencies has been scanned for binary hardening. The grade reflects RELRO / stack canary / FORTIFY / PIE coverage.
F · regex 2026.4.4 · 216 gadgets · risk 0.0