BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions
Updated Jun 11, 2024 - Python
The most powerful Microsoft Windows hardening and benchmark tool! Work in progress (experimental). Best security database you will have.
Take your packages for a jog!
https://db-benchmarks.com website
Source for the TechEmpower Framework Benchmarks project
Modern load testing tool using JavaScript and TypeScript, inspired by k6.
[ACL 2024 Main] NewsBench: A Systematic Evaluation Framework for Assessing Editorial Capabilities of Large Language Models in Chinese Journalism
A collection of MARL benchmarks based on TorchRL
[CVPR2024 Highlight] VBench - We Evaluate Video Generation
ALPBench is a Python package for the specification, execution, and performance monitoring of active learning pipelines.
Foundation model benchmarking tool. Run any model on Amazon SageMaker and benchmark for performance across instance type and serving stack options.
🐰 Bencher - Continuous Benchmarking
Framework for benchmarking vector search engines
MINERS ⛏️: The semantic retrieval benchmark for evaluating multilingual language models.
MTEB: Massive Text Embedding Benchmark
Cista is a simple, high-performance, zero-copy C++ serialization & reflection library.