Lucy

Overview

Lucy is a 1.7B parameter model built on Qwen3-1.7B, optimized for web search through tool calling. The model has been trained to work effectively with search APIs like Serper, enabling web search capabilities in resource-constrained environments.

Performance

SimpleQA Benchmark

Lucy achieves competitive performance on SimpleQA despite its small size:

Lucy SimpleQA Performance

The benchmark shows Lucy (1.7B) compared against models ranging from 4B to 600B+ parameters. While larger models generally perform better, Lucy demonstrates that effective web search integration can partially compensate for smaller model size.

Requirements

Memory:
- Minimum: 4GB RAM (with Q4 quantization)
- Recommended: 8GB RAM (with Q8 quantization)
Search API: Serper API key required for web search functionality
Hardware: Runs on CPU or GPU

Using Lucy

Quick Start

Download Jan Desktop
Download Lucy from the Hub
Configure Serper MCP with your API key
Start using web search through natural language

Demo

Lucy Demo

Deployment Options

Using vLLM:

vllm serve Menlo/Lucy-128k \
    --host 0.0.0.0 \
    --port 1234 \
    --enable-auto-tool-choice \
    --tool-call-parser hermes \
    --rope-scaling '{"rope_type":"yarn","factor":3.2,"original_max_position_embeddings":40960}' \
    --max-model-len 131072

Using llama.cpp:

llama-server model.gguf \
    --host 0.0.0.0 \
    --port 1234 \
    --rope-scaling yarn \
    --rope-scale 3.2 \
    --yarn-orig-ctx 40960

Recommended Parameters

Temperature: 0.7
Top-p: 0.9
Top-k: 20
Min-p: 0.0

What Lucy Does Well

Web Search Integration: Optimized to call search tools and process results
Small Footprint: 1.7B parameters means lower memory requirements
Tool Calling: Reliable function calling for search APIs

Limitations

Requires Internet: Web search functionality needs active connection
API Costs: Serper API has usage limits and costs
Context Processing: While supporting 128k context, performance may vary with very long inputs
General Knowledge: Limited by 1.7B parameter size for tasks beyond search

Models Available

Citation

@misc{dao2025lucyedgerunningagenticweb,
      title={Lucy: edgerunning agentic web search on mobile with machine generated task vectors},
      author={Alan Dao and Dinh Bach Vu and Alex Nguyen and Norapat Buppodom},
      year={2025},
      eprint={2508.00360},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2508.00360},
}