llm-bench

Grade: D (56) · completed
Type: CLI Tool · unknown / rust · tiny
Files: 33
LOC: 7,354
Frameworks: 0
Languages: 5

Pipeline State

completed
Run ID
#304081
Phase
done
Progress
1%
Started
Finished
2026-04-13 01:31:02
LLM tokens
0

Pipeline Metadata

Stage
Skipped
Decision
skip_scaffold_dup
Novelty
45.71
Isolation
Framework unique
Last stage change
2026-04-16 18:15:42
Deduplication group #48152
Member of a group with 1 similar repo; canonical: #94586
Top concepts (6)
Repository · Project Description · CLI Tool · Factory · Analytics · Configuration
All rows scored by the Repobility analyzer (https://repobility.com)

AI Prompt

Build me a high-performance benchmarking tool in Rust for testing OpenAI-compatible LLM inference servers. I need it to measure detailed performance characteristics by hitting the `/v1/chat/completions` endpoint. The tool must support streaming to measure Time-To-First-Token (TTFT) and Inter-Token Latency (ITL) percentiles (P50, P90, P95, P99). It should allow flexible load patterns, including fixed QPS, concurrent workers, and Poisson arrival distributions. Additionally, it needs robust error tracking for connection, HTTP 4xx/5xx, and timeouts, and it should analyze metrics based on context size (Small, Medium, Large, etc.).
rust benchmarking llm openai-api performance async tokio networking
Generated by gemma4:latest
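
The TTFT/ITL percentile reporting requested in the prompt can be sketched in Rust. This is an illustrative nearest-rank implementation, not the repository's actual code; the `percentile` helper and the sample latencies are hypothetical:

```rust
// Nearest-rank percentile over a sorted slice of latency samples (ms).
// rank = ceil(p/100 * n), 1-indexed, clamped into bounds.
fn percentile(sorted: &[f64], p: f64) -> f64 {
    assert!(!sorted.is_empty());
    let n = sorted.len();
    let rank = ((p / 100.0) * n as f64).ceil() as usize;
    sorted[rank.saturating_sub(1).min(n - 1)]
}

fn main() {
    // Pretend these are 100 inter-token latency (ITL) samples in ms.
    let mut samples: Vec<f64> = (1..=100).map(|i| i as f64).collect();
    samples.sort_by(|a, b| a.partial_cmp(b).unwrap());
    for p in [50.0, 90.0, 95.0, 99.0] {
        println!("P{:.0} = {} ms", p, percentile(&samples, p));
    }
}
```

The same helper would serve both TTFT (one sample per request) and ITL (one sample per streamed token after the first).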

Catalog Information

A high-performance benchmarking tool for OpenAI-compatible LLM inference servers, designed to measure detailed performance characteristics of local LLM servers.

Description

llm-bench is a high-performance benchmarking tool for OpenAI-compatible LLM inference servers. It measures detailed performance characteristics of local LLM servers like llama-server, vLLM, TGI, and other OpenAI API-compatible endpoints. The tool supports various load patterns, including concurrent mode, fixed QPS mode, arrival distributions, duration-based testing, request count mode, and warmup period.
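
The Poisson arrival pattern mentioned above is typically implemented by drawing exponentially distributed inter-arrival gaps: for a target rate of `qps` requests/sec, each gap is `-ln(U) / qps` with `U` uniform on (0, 1]. A minimal std-only Rust sketch follows; the `Lcg` generator is an illustrative stand-in (a real implementation would likely use the `rand` crate), and none of these names come from the repository itself:

```rust
// Tiny linear congruential generator standing in for a real RNG.
struct Lcg(u64);

impl Lcg {
    // Returns a uniform sample in (0, 1] from the high 53 bits of the state.
    fn next_uniform(&mut self) -> f64 {
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        ((self.0 >> 11) as f64 + 1.0) / (1u64 << 53) as f64
    }
}

// Inverse-transform sampling of an exponential inter-arrival gap (seconds).
fn exponential_gap(rng: &mut Lcg, qps: f64) -> f64 {
    -rng.next_uniform().ln() / qps
}

fn main() {
    let mut rng = Lcg(42);
    let qps = 10.0; // target 10 requests/sec, so the mean gap should be ~0.1 s
    let n = 100_000;
    let mean: f64 = (0..n).map(|_| exponential_gap(&mut rng, qps)).sum::<f64>() / n as f64;
    println!("mean gap ≈ {:.4} s (expected ~0.1)", mean);
    // In the real tool each gap would be awaited (e.g. with tokio::time::sleep)
    // before dispatching the next request.
}
```

Fixed-QPS mode is the degenerate case where every gap is exactly `1.0 / qps`.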


Novelty

7/10

Tags

benchmarking performance-measurement llm-inference-server openai-api-compatibility streaming-support async-concurrent-testing

Technologies

serde tokio warp

Claude Models

claude-opus-4.6

Quality Score

D
55.9/100
Structure
55
Code Quality
52
Documentation
60
Testing
15
Practices
71
Security
100
Dependencies
80

Strengths

  • CI/CD pipeline configured (github_actions)
  • Consistent naming conventions (snake_case)
  • Good security practices: no major issues detected

Weaknesses

  • No LICENSE file: legal ambiguity for contributors
  • No tests found: high risk of regressions
  • 765 duplicate lines detected: consider DRY refactoring
  • 2 'god files' with >500 LOC need decomposition

Recommendations

  • Add a test suite: start with critical path integration tests
  • Add a linter configuration to enforce code style consistency
  • Add a LICENSE file (MIT recommended for open source)

Security & Health

Tech Debt: 7.1h (grade C)
DORA Rating: Medium
OWASP: A (100%)
Quality Gate: PASS
Risk: A (2)
License: MIT
Duplication: 15.4%

Languages

rust
72.4%
markdown
13.7%
yaml
7.4%
toml
3.4%
python
3.1%

Frameworks

None detected

Symbols

function: 96
struct: 41
constant: 27
extension: 16
enum: 8
macro: 1

Concepts (6)

Category · Name · Description · Confidence
design_pattern · Repository · Found repository-named files · 80%
auto_description · Project Description · A high-performance benchmarking tool for OpenAI-compatible LLM inference servers, designed to measure detailed performance characteristics of local LLM servers like llama-server, vLLM, TGI, and other OpenAI API-compatible endpoints. · 80%
auto_category · CLI Tool · cli · 70%
design_pattern · Factory · Found factory/create_ naming patterns · 60%
business_logic · Analytics · Detected from 3 related files · 50%
business_logic · Configuration · Detected from 2 related files · 50%

Quality Timeline

1 quality score recorded.


Embed Badge

Add to your README:

![Quality](https://repos.aljefra.com/badge/27908.svg)

BinComp Dependency Hardening

2 of this repo's dependencies have been scanned for binary hardening. Grade reflects RELRO / stack canary / FORTIFY / PIE coverage.
requests 2.33.1: grade N · 0 gadgets · risk 3687.0
pandas 3.0.2: grade F · 6,381 gadgets · risk 0.0