From 7ffeeaba0f5e3aba4e5586eb300f6c82f55885ee Mon Sep 17 00:00:00 2001
From: PeterGriffinJin
Date: Thu, 13 Mar 2025 14:00:41 +0000
Subject: [PATCH] modify readme

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index acb265b..2769492 100644
--- a/README.md
+++ b/README.md
@@ -1,6 +1,6 @@
 # Search-R1: Train your LLMs to reason and call a search engine with reinforcement learning
 
-Search-R1 is a reproduction of DeepSeek-R1(-Zero) methods for training reasoning and searching (tool-call) interleaved LLMs. We built upon [veRL](https://github.com/volcengine/verl).
+Search-R1 is an extension of DeepSeek-R1(-Zero) methods for training reasoning and searching (tool-call) interleaved LLMs. We built upon [veRL](https://github.com/volcengine/verl).
 
 Through RL (rule-based outcome reward), the 3B **base** LLM (both Qwen2.5-3b-base and Llama3.2-3b-base) develops reasoning and search engine calling abilities all on its own.