diff --git a/README.md b/README.md
index acb265b..2769492 100644
--- a/README.md
+++ b/README.md
@@ -1,6 +1,6 @@
 # Search-R1: Train your LLMs to reason and call a search engine with reinforcement learning
-Search-R1 is a reproduction of DeepSeek-R1(-Zero) methods for training reasoning and searching (tool-call) interleaved LLMs. We built upon [veRL](https://github.com/volcengine/verl).
+Search-R1 is an extension of DeepSeek-R1(-Zero) methods for training reasoning and searching (tool-call) interleaved LLMs. We built upon [veRL](https://github.com/volcengine/verl).
 Through RL (rule-based outcome reward), the 3B **base** LLM (both Qwen2.5-3b-base and Llama3.2-3b-base) develops reasoning and search engine calling abilities all on its own.