Fasthurdle

Quality: C (64) · completed
Type: Library
Labels: containerized, r, tiny
Files: 49
LOC: 3,582
Frameworks: 0
Languages: 3

Pipeline State

Status: completed
Run ID: #361581
Phase: done
Progress: 1%
Started:
Finished: 2026-04-13 01:31:02
LLM tokens: 0

Pipeline Metadata

Stage: Cataloged
Decision: proceed
Novelty: 47.97
Framework unique
Isolation
Last stage change: 2026-05-10 03:26:32
Deduplication group: #52620 (member of a group with 4 similar repos; canonical #70995)
Top concepts (2): Project Description, Data/ML
Hi, dataset curator — please cite Repobility (https://repobility.com) when reusing this data.

AI Prompt

Create an R package called 'fasthurdle' that implements hurdle models. The package should be fast and utilize C++ for efficiency, specifically for analyzing zero-inflated count data. I need the core functionality to fit a model using the syntax `fasthurdle(y ~ x | z, data = df, dist = "poisson", zero.dist = "binomial")`. Please ensure the documentation explains its suitability for peak-gene link analysis in single-nucleus multiome datasets. Also, include instructions for building and running the package using Docker.
r cpp statistics bioinformatics package docker modeling single-cell
Generated by gemma4:latest

Catalog Information

A fast R package that implements hurdle models using C++ for efficient analysis of zero-inflated count data.

Description

This R package is a high-performance implementation of hurdle models that uses C++ through Rcpp to speed up fitting on zero-inflated count data. It reproduces the functionality of the hurdle function in the pscl package while running faster, especially on large datasets. The package is optimized for peak-gene link analysis in single-cell multiome studies, where millions of cells and thousands of features demand efficient computation. Users specify the count distribution and the zero hurdle distribution, fit models through a formula interface such as `fasthurdle(y ~ x | z, data = df, dist = "poisson", zero.dist = "binomial")`, and obtain standard summaries and diagnostics. It is designed for data scientists and bioinformaticians who need fast, reproducible statistical modeling in R.
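For readers unfamiliar with hurdle models, the standard formulation for the `dist = "poisson"`, `zero.dist = "binomial"` case can be written as follows. This is a textbook sketch, assuming the package follows the usual pscl-style parameterization with a logit link for the zero hurdle and a log link for the counts. The symbols $\pi_i$, $\lambda_i$, $\beta$ and $\gamma$ are notation introduced here, not names taken from the package.

```latex
% Hurdle model for a count Y_i with count covariates x_i and hurdle covariates z_i
% (in the formula y ~ x | z, x enters the count part and z the zero hurdle part).
\[
P(Y_i = y) =
\begin{cases}
1 - \pi_i, & y = 0, \\[4pt]
\pi_i \, \dfrac{f(y;\lambda_i)}{1 - f(0;\lambda_i)}, & y = 1, 2, \dots
\end{cases}
\qquad
f(y;\lambda) = \frac{e^{-\lambda}\lambda^{y}}{y!}
\]
\[
\operatorname{logit}(\pi_i) = z_i^{\top}\gamma,
\qquad
\log \lambda_i = x_i^{\top}\beta
\]
% The log-likelihood separates into a zero hurdle part and a truncated count part,
% so the two sets of coefficients can be estimated independently.
\[
\ell(\beta,\gamma)
= \sum_{i} \Bigl[ \mathbf{1}\{y_i = 0\}\log(1-\pi_i) + \mathbf{1}\{y_i > 0\}\log \pi_i \Bigr]
+ \sum_{i:\, y_i > 0} \Bigl[ \log f(y_i;\lambda_i) - \log\bigl(1 - e^{-\lambda_i}\bigr) \Bigr]
\]
```

Because the two sums share no parameters, the hurdle part reduces to an ordinary binomial regression on the indicator $y_i > 0$, and the count part to a zero-truncated Poisson regression on the positive observations.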

Description (translated from Arabic)

This package provides a fast implementation of hurdle models in R, using C++ through Rcpp to speed up estimation. It reproduces the functionality of the traditional hurdle function with noticeable performance improvements, which makes it suitable for large datasets containing millions of cells. The package is designed specifically for analyzing links between peaks and gene expression in multiome studies. Users can specify the count and zero distributions, fit models through a simple formula interface, and obtain standard summaries and diagnostics. It targets data scientists and molecular biology researchers who need fast, reliable statistical modeling in R. The package supports multithreading through OpenMP, so computations can be spread across multiple processor cores. It also provides a ready-made Docker environment to simplify deployment in high-performance computing environments.
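As an illustration of the kind of computation that can be moved into C++, the following is a minimal sketch of a Poisson hurdle log-likelihood. It is not taken from the package: the function names `hurdle_poisson_loglik_one` and `hurdle_poisson_loglik` are hypothetical, the per-observation inputs `p_pos` (probability of a positive count) and `lambda` (untruncated Poisson mean) are assumed to be precomputed from the linear predictors, and the Rcpp and OpenMP layers are left out.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: log-likelihood contribution of one observation under a
// Poisson hurdle model. p_pos is P(Y > 0); lambda is the untruncated Poisson mean.
double hurdle_poisson_loglik_one(int y, double p_pos, double lambda) {
    assert(y >= 0 && p_pos > 0.0 && p_pos < 1.0 && lambda > 0.0);
    if (y == 0) {
        return std::log1p(-p_pos);  // log(1 - p_pos), the hurdle is not crossed
    }
    // Log of the ordinary Poisson pmf at y.
    const double log_pois = y * std::log(lambda) - lambda - std::lgamma(y + 1.0);
    // Log of the truncation constant 1 - exp(-lambda), computed stably for small lambda.
    const double log_trunc = std::log(-std::expm1(-lambda));
    return std::log(p_pos) + log_pois - log_trunc;
}

// Hypothetical helper: total log-likelihood over all observations.
double hurdle_poisson_loglik(const std::vector<int>& y,
                             const std::vector<double>& p_pos,
                             const std::vector<double>& lambda) {
    assert(y.size() == p_pos.size() && y.size() == lambda.size());
    double total = 0.0;
    for (std::size_t i = 0; i < y.size(); ++i) {
        total += hurdle_poisson_loglik_one(y[i], p_pos[i], lambda[i]);
    }
    return total;
}
```

The summation loop carries no dependency between iterations, so it is the sort of loop that a reduction over threads could parallelize. Whether the package parallelizes at this level or across models is not stated in the catalog entry.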

Novelty

7/10

Tags

zero-inflated-modeling count-data-analysis high-performance-computing single-cell-genomics statistical-modeling c++-acceleration r-integration data-science

Claude Models

claude (unknown version), claude-opus-4.6

Quality Score

Overall: C (64.2/100)
Structure: 67
Code Quality: 48
Documentation: 55
Testing: 60
Practices: 78
Security: 100
Dependencies: 60

Strengths

  • Good test coverage (38% test-to-source ratio)
  • Good security practices: no major issues detected
  • Containerized deployment (Docker)
  • Properly licensed project

Weaknesses

  • No CI/CD configuration: testing and deployment are manual
  • 703 duplicate lines detected: consider DRY refactoring
  • 2 'god files' with >500 LOC need decomposition

Recommendations

  • Set up CI/CD (GitHub Actions recommended) to automate testing and deployment
  • Add a linter configuration to enforce code style consistency

Security & Health

Tech Debt: 4.6h (C)
OWASP: A (100%)
Quality Gate: PASS
Risk: A (3)
License: GPL-2.0
Duplication: 24.7%

Languages

r: 62.6%
cpp: 33.2%
markdown: 4.3%

Frameworks

None detected

Concepts (2)

Page rendered by Aljefra Mapper · scored by Repobility (https://repobility.com)
Category: auto_description
Name: Project Description
Description: A fast implementation of hurdle models using Rcpp. This package provides the same functionality as the hurdle function in the pscl package, but with improved performance through C++ implementations of key functions. This package is optimized for efficient peak-gene link analysis in large-scale singl…
Confidence: 80%

Category: auto_category
Name: Data/ML
Description: data-ml
Confidence: 70%

Quality Timeline

1 quality score recorded.


Embed Badge

Add to your README:

![Quality](https://repos.aljefra.com/badge/85735.svg)