create industry strength dataset (#640)

zhimin-z · web-flow · commit dffe1cd4ea8d · 2025-03-04T16:02:13.000+08:00
diff --git a/README.md b/README.md
@@ -19,9 +19,10 @@ Additionally, we provide a [search toolkit](https://huggingface.co/spaces/zhimin
 | [📓 Data Science Notebook](#data-science-notebook) | [💾 Data Storage Optimisation](#data-storage-optimisation) | [💸 Data Stream Processing](#data-stream-processing) |
 | [💪 Deployment & Serving](#deployment-and-serving) | [📈 Evaluation & Monitoring](#evaluation-and-monitoring) | [🔍 Explainability & Fairness](#explainability-and-fairness) |
 | [🎁 Feature Store](#feature-store) | [🔴 Industry-strength Anomaly Detection](#industry-strength-anodet) | [👁️ Industry-strength Computer Vision](#industry-strength-cv) |
-| [🔥 Industry-strength Information Retrieval](#industry-strength-infret) | [🔠 Industry-strength Natural Language Processing](#industry-strength-nlp) | [🙌 Industry-strength Recommender System](#industry-strength-recsys) |
-| [🍕 Industry-strength Reinforcement Learning](#industry-strength-rl) | [📊 Industry-strength Visualisation](#industry-strength-visualisation) | [📅 Metadata Management](#metadata-management) |
-| [📜 Model, Data & Experiment Management](#model-data-and-experiment-management) | [🔩 Model Storage Optimisation](#model-storage-optimisation) | [🔏 Privacy & Robustness](#privacy-and-robustness) | [🏁 Training Orchestration](#training-orchestration) |
+| [🗂️ Industry-strength Dataset](#industry-strength-dataset) | [🔥 Industry-strength Information Retrieval](#industry-strength-infret) | [🔠 Industry-strength Natural Language Processing](#industry-strength-nlp) |
+| [🙌 Industry-strength Recommender System](#industry-strength-recsys) | [🍕 Industry-strength Reinforcement Learning](#industry-strength-rl) | [📊 Industry-strength Visualisation](#industry-strength-visualisation) |
+| [📅 Metadata Management](#metadata-management) | [📜 Model, Data & Experiment Management](#model-data-and-experiment-management) | [🔩 Model Storage Optimisation](#model-storage-optimisation) |
+| [🔏 Privacy & Robustness](#privacy-and-robustness) | [🏁 Training Orchestration](#training-orchestration) |
 
 ## Contributing to the list
 
@@ -393,6 +394,13 @@ Please review our [CONTRIBUTING.md](https://github.com/EthicalML/awesome-product
 * [supervision](https://github.com/roboflow/supervision) ![](https://img.shields.io/github/stars/roboflow/supervision.svg?style=social) - Supervision is a Python library designed for efficient computer vision pipeline management, providing tools for annotation, visualization, and monitoring of models.
 * [VideoSys](https://github.com/NUS-HPC-AI-Lab/VideoSys) ![](https://img.shields.io/github/stars/NUS-HPC-AI-Lab/VideoSys.svg?style=social) - VideoSys supports many diffusion models with our various acceleration techniques, enabling these models to run faster and consume less memory.
 
+## Industry Strength Dataset
+* [Dataset Viewer](https://github.com/huggingface/dataset-viewer) ![](https://img.shields.io/github/stars/huggingface/dataset-viewer.svg?style=social) - Dataset Viewer is a tool that enables users to interactively explore and analyze datasets by providing functionalities such as pagination, filtering, searching, and basic statistical insights.
+* [DiffusionDB](https://github.com/poloclub/diffusiondb) ![](https://img.shields.io/github/stars/poloclub/diffusiondb.svg?style=social) - DiffusionDB is a large-scale text-to-image prompt gallery dataset based on Stable Diffusion.
+* [PMLB](https://github.com/EpistasisLab/pmlb) ![](https://img.shields.io/github/stars/EpistasisLab/pmlb.svg?style=social) - PMLB is a large, curated repository of benchmark datasets for evaluating supervised machine learning algorithms.
+* [SemanticKITTI](https://github.com/PRBonn/semantic-kitti-api) ![](https://img.shields.io/github/stars/PRBonn/semantic-kitti-api.svg?style=social) - SemanticKITTI helps developers to navigate, visualize, process, and evaluate results for point clouds and labels from the SemanticKITTI dataset.
+* [UltraFeedback](https://github.com/OpenBMB/UltraFeedback) ![](https://img.shields.io/github/stars/OpenBMB/UltraFeedback.svg?style=social) - UltraFeedback is a large-scale, fine-grained, diverse preference dataset, used for training powerful reward models and critic models.
+
 ## Industry Strength InfRet
 * [AutoRAG](https://github.com/Marker-Inc-Korea/AutoRAG) ![](https://img.shields.io/github/stars/Marker-Inc-Korea/AutoRAG.svg?style=social) - AutoRAG is a RAG AutoML tool that automatically finds an optimal RAG pipeline for your data.
 * [Cognita](https://github.com/truefoundry/cognita) ![](https://img.shields.io/github/stars/truefoundry/cognita.svg?style=social) - Cognita is a RAG framework for building modular and production-ready applications.