
Commit d70dddb

mudler authored and github-actions[bot] committed
⬆️ Checksum updates in gallery/index.yaml
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
1 parent 2fcfe54 commit d70dddb


gallery/index.yaml

Lines changed: 16 additions & 22 deletions
@@ -584,24 +584,24 @@
 - https://huggingface.co/Daemontatox/Qwen3-14B-Griffon
 - https://huggingface.co/mradermacher/Qwen3-14B-Griffon-i1-GGUF
 description: |
-This is a fine-tuned version of the Qwen3-14B model using the high-quality OpenThoughts2-1M dataset. Fine-tuned with Unsloth’s TRL-compatible framework and LoRA for efficient performance, this model is optimized for advanced reasoning tasks, especially in math, logic puzzles, code generation, and step-by-step problem solving.
-Training Dataset
+This is a fine-tuned version of the Qwen3-14B model using the high-quality OpenThoughts2-1M dataset. Fine-tuned with Unsloth’s TRL-compatible framework and LoRA for efficient performance, this model is optimized for advanced reasoning tasks, especially in math, logic puzzles, code generation, and step-by-step problem solving.
+Training Dataset

-Dataset: OpenThoughts2-1M
-Source: A synthetic dataset curated and expanded by the OpenThoughts team
-Volume: ~1.1M high-quality examples
-Content Type: Multi-turn reasoning, math proofs, algorithmic code generation, logical deduction, and structured conversations
-Tools Used: Curator Viewer
+Dataset: OpenThoughts2-1M
+Source: A synthetic dataset curated and expanded by the OpenThoughts team
+Volume: ~1.1M high-quality examples
+Content Type: Multi-turn reasoning, math proofs, algorithmic code generation, logical deduction, and structured conversations
+Tools Used: Curator Viewer

-This dataset builds upon OpenThoughts-114k and integrates strong reasoning-centric data sources like OpenR1-Math and KodCode.
-Intended Use
+This dataset builds upon OpenThoughts-114k and integrates strong reasoning-centric data sources like OpenR1-Math and KodCode.
+Intended Use

-This model is particularly suited for:
+This model is particularly suited for:

-Chain-of-thought and step-by-step reasoning
-Code generation with logical structure
-Educational tools for math and programming
-AI agents requiring multi-turn problem-solving
+Chain-of-thought and step-by-step reasoning
+Code generation with logical structure
+Educational tools for math and programming
+AI agents requiring multi-turn problem-solving
 overrides:
 parameters:
 model: Qwen3-14B-Griffon.i1-Q4_K_M.gguf
@@ -7078,13 +7078,7 @@
 urls:
 - https://huggingface.co/ServiceNow-AI/Apriel-Nemotron-15b-Thinker
 - https://huggingface.co/bartowski/ServiceNow-AI_Apriel-Nemotron-15b-Thinker-GGUF
-description: |
-Apriel-Nemotron-15b-Thinker is a 15 billion‑parameter reasoning model in ServiceNow’s Apriel SLM series which achieves competitive performance against similarly sized state-of-the-art models like o1‑mini, QWQ‑32b, and EXAONE‑Deep‑32b, all while maintaining only half the memory footprint of those alternatives. It builds upon the Apriel‑15b‑base checkpoint through a three‑stage training pipeline (CPT, SFT and GRPO).
-Highlights
-Half the size of SOTA models like QWQ-32b and EXAONE-32b and hence memory efficient.
-It consumes 40% less tokens compared to QWQ-32b, making it super efficient in production. 🚀🚀🚀
-On par or outperforms on tasks like - MBPP, BFCL, Enterprise RAG, MT Bench, MixEval, IFEval and Multi-Challenge making it great for Agentic / Enterprise tasks.
-Competitive performance on academic benchmarks like AIME-24 AIME-25, AMC-23, MATH-500 and GPQA considering model size.
+description: "Apriel-Nemotron-15b-Thinker is a 15 billion‑parameter reasoning model in ServiceNow’s Apriel SLM series which achieves competitive performance against similarly sized state-of-the-art models like o1‑mini, QWQ‑32b, and EXAONE‑Deep‑32b, all while maintaining only half the memory footprint of those alternatives. It builds upon the Apriel‑15b‑base checkpoint through a three‑stage training pipeline (CPT, SFT and GRPO).\nHighlights\n Half the size of SOTA models like QWQ-32b and EXAONE-32b and hence memory efficient.\n It consumes 40% less tokens compared to QWQ-32b, making it super efficient in production. \U0001F680\U0001F680\U0001F680\n On par or outperforms on tasks like - MBPP, BFCL, Enterprise RAG, MT Bench, MixEval, IFEval and Multi-Challenge making it great for Agentic / Enterprise tasks.\n Competitive performance on academic benchmarks like AIME-24 AIME-25, AMC-23, MATH-500 and GPQA considering model size.\n"
 overrides:
 parameters:
 model: ServiceNow-AI_Apriel-Nemotron-15b-Thinker-Q4_K_M.gguf
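The hunk above replaces the Apriel entry's literal block scalar (description: |) with a double-quoted flow scalar that encodes newlines as \n and the rocket emoji as \U0001F680. As a quick, hedged illustration that the two YAML styles decode to the same kind of multi-line string, here is a minimal PyYAML sketch; it is not part of the commit, and the excerpted text is a trimmed stand-in for the full description.

# Minimal sketch (illustrative only): a YAML block scalar and an equivalent
# double-quoted flow scalar decode to the same Python string.
import yaml  # PyYAML

block_form = """\
description: |
  Half the size of SOTA models like QWQ-32b and EXAONE-32b and hence memory efficient.
  It consumes 40% less tokens compared to QWQ-32b. \U0001F680
"""

flow_form = (
    'description: "Half the size of SOTA models like QWQ-32b and EXAONE-32b '
    'and hence memory efficient.\\nIt consumes 40% less tokens compared to '
    'QWQ-32b. \\U0001F680\\n"'
)

block_text = yaml.safe_load(block_form)["description"]
flow_text = yaml.safe_load(flow_form)["description"]

# Both forms yield a multi-line string with the emoji restored from its
# \U0001F680 escape, so the gallery description is unchanged in content.
print(block_text == flow_text)  # expected: True

Either style is valid YAML; the quoted form keeps the whole description on one physical line, which is presumably why the re-serialization step emits it here.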
@@ -9013,8 +9007,8 @@
 model: deepseek-r1-distill-llama-8b-Q4_K_M.gguf
 files:
 - filename: deepseek-r1-distill-llama-8b-Q4_K_M.gguf
-sha256: f8eba201522ab44b79bc54166126bfaf836111ff4cbf2d13c59c3b57da10573b
 uri: huggingface://unsloth/DeepSeek-R1-Distill-Llama-8B-GGUF/DeepSeek-R1-Distill-Llama-8B-Q4_K_M.gguf
+sha256: 0addb1339a82385bcd973186cd80d18dcc71885d45eabd899781a118d03827d9
 - !!merge <<: *llama31
 name: "selene-1-mini-llama-3.1-8b"
 icon: https://atla-ai.notion.site/image/https%3A%2F%2Fprod-files-secure.s3.us-west-2.amazonaws.com%2Ff08e6e70-73af-4363-9621-90e906b92ebc%2F1bfb4316-1ce6-40a0-800c-253739cfcdeb%2Fatla_white3x.svg?table=block&id=17c309d1-7745-80f9-8f60-e755409acd8d&spaceId=f08e6e70-73af-4363-9621-90e906b92ebc&userId=&cache=v2
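This last hunk is the checksum update named in the commit title: the sha256 recorded for deepseek-r1-distill-llama-8b-Q4_K_M.gguf is replaced and the field now follows the uri. Below is a minimal sketch of how a locally downloaded GGUF could be checked against the new value; the local path is hypothetical and this is not how LocalAI itself performs the verification.

# Minimal verification sketch (not part of the commit): compare a downloaded
# GGUF against the sha256 recorded in gallery/index.yaml.
import hashlib

EXPECTED_SHA256 = "0addb1339a82385bcd973186cd80d18dcc71885d45eabd899781a118d03827d9"

def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream the file in 1 MiB chunks so multi-GB GGUF files fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Hypothetical local path to the downloaded model file.
actual = sha256_of("models/deepseek-r1-distill-llama-8b-Q4_K_M.gguf")
if actual != EXPECTED_SHA256:
    raise SystemExit(f"checksum mismatch: expected {EXPECTED_SHA256}, got {actual}")
print("checksum OK")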
