
Alibaba's ZeroSearch Teaches AI To Search Without Search Engines, Cuts Training Costs By 88% (venturebeat.com) 7
Alibaba Group researchers have developed "ZeroSearch," a technique that enables large language models to acquire search capabilities without using external search engines during training. The approach transforms LLMs into retrieval modules through supervised fine-tuning and employs a "curriculum-based rollout strategy" that gradually degrades generated document quality.
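To make the "curriculum-based rollout strategy" concrete, here is a minimal sketch of a schedule that raises the chance of serving a degraded (noisy) simulated document as training progresses. Everything in it is an assumption for illustration: the function name `noisy_doc_probability`, the linear ramp, and the `p_start`/`p_end` values are hypothetical and are not taken from the summary; the researchers' actual schedule may look different.

```python
def noisy_doc_probability(step, total_steps, p_start=0.0, p_end=0.75):
    """Probability of returning a degraded document at a given training step.

    Hypothetical linear ramp: early steps mostly see useful documents,
    later steps see noisy ones more often. The shape and endpoints are
    illustrative assumptions, not the published schedule.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    # Clamp progress to [0, 1] so out-of-range steps stay well-defined.
    progress = min(max(step / total_steps, 0.0), 1.0)
    return p_start + progress * (p_end - p_start)


# Print the schedule at a few points of a 100-step run.
for step in (0, 50, 100):
    print(step, noisy_doc_probability(step, 100))
```

A training loop would draw a random number each rollout and ask the simulation LLM for a noisy document whenever the draw falls below this probability.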
In tests across seven question-answering datasets, ZeroSearch matched or exceeded the performance [PDF] of models trained with real search engines. A 7B-parameter retrieval module achieved results comparable to Google Search, while a 14B-parameter version outperformed it. The cost savings are substantial: training with 64,000 search queries using Google Search via SerpAPI would cost approximately $586.70, compared to just $70.80 using a 14B-parameter simulation LLM on four A100 GPUs -- an 88% reduction.
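The 88% figure can be checked directly from the two dollar amounts quoted above. The short sketch below only redoes that arithmetic; the variable and function names are made up for illustration.

```python
def cost_reduction(baseline_usd, new_usd):
    """Fractional saving of new_usd relative to baseline_usd."""
    return 1.0 - new_usd / baseline_usd


# Figures quoted in the summary.
serpapi_cost_usd = 586.70      # 64,000 Google Search queries via SerpAPI
simulation_cost_usd = 70.80    # 14B simulation LLM on four A100 GPUs
num_queries = 64_000

reduction = cost_reduction(serpapi_cost_usd, simulation_cost_usd)
per_query_usd = serpapi_cost_usd / num_queries

print(f"reduction: {reduction:.1%}")                         # rounds to the quoted 88%
print(f"SerpAPI cost per query: ${per_query_usd:.4f}")       # a little under one cent
```

The per-query figure works out to slightly less than one cent per search.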
The technique works with multiple model families including Qwen-2.5 and LLaMA-3.2. Researchers have released their code, datasets, and pre-trained models on GitHub and Hugging Face, potentially lowering barriers to entry for smaller AI companies developing sophisticated assistants.
Wonder if they will re-introduce it on Aliexpress (Score:2)
Their 'smart' search function was so utterly annoying.
Re: (Score:2)
If they really pay one cent per search query, I think they have deeper problems.
Re: (Score:2)
That's not unusual. Look at the prices at openrouter:
"The web plugin uses your OpenRouter credits and charges $4 per 1000 results. By default, max_results set to 5, this comes out to a maximum of $0.02 per request, in addition to the LLM usage for the search result prompt tokens."
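The $0.02 ceiling in that quote follows from the per-result price. A quick check, using only the numbers quoted (names are illustrative):

```python
# Numbers taken from the quoted OpenRouter pricing text.
price_per_1000_results_usd = 4.0
default_max_results = 5

price_per_result_usd = price_per_1000_results_usd / 1000
max_cost_per_request_usd = price_per_result_usd * default_max_results

print(f"max search cost per request: ${max_cost_per_request_usd:.2f}")
```

That is before the LLM usage for the search result prompt tokens, which the quote says is billed on top.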
Re: (Score:2)
Isn't that just due to Google search being shit now?
Google search being shit is attributable mostly to Google being shit, with a nice layer of SEO frosting; SEO which they promoted to the detriment of the entire information sphere.
Let's go! (Score:1)
They do not replace search engines (Score:2)
They replace USING a search engine during training. The key point is training on that prompt: