Examples

This section contains practical code examples demonstrating vLLM Client usage patterns.

Available Examples

Basic Usage

Example	Description
Basic Chat	Simple chat completion requests
Streaming Chat	Real-time streaming responses
Streaming Completions	Legacy completions streaming
Tool Calling	Function calling integration
Multi-modal	Image and multi-modal inputs

Quick Examples

Hello World

use vllm_client::{VllmClient, json};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = VllmClient::new("http://localhost:8000/v1");
    
    let response = client.chat.completions().create()
        .model("Qwen/Qwen2.5-7B-Instruct")
        .messages(json!([{"role": "user", "content": "Hello!"}]))
        .send()
        .await?;
    
    println!("{}", response.content.unwrap_or_default());
    Ok(())
}

Streaming Output

use vllm_client::{VllmClient, json, StreamEvent};
use futures::StreamExt;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = VllmClient::new("http://localhost:8000/v1");
    
    let mut stream = client.chat.completions().create()
        .model("Qwen/Qwen2.5-7B-Instruct")
        .messages(json!([{"role": "user", "content": "Tell me a story"}]))
        .stream(true)
        .send_stream()
        .await?;
    
    while let Some(event) = stream.next().await {
        if let StreamEvent::Content(delta) = event {
            print!("{}", delta);
        }
    }
    
    println!();
    Ok(())
}

Tool Calling

#![allow(unused)]
fn main() {
use vllm_client::{VllmClient, json};

let tools = json!([
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get weather for a location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {"type": "string"}
                },
                "required": ["location"]
            }
        }
    }
]);

let response = client.chat.completions().create()
    .model("Qwen/Qwen2.5-7B-Instruct")
    .messages(json!([
        {"role": "user", "content": "What's the weather in Tokyo?"}
    ]))
    .tools(tools)
    .send()
    .await?;

if response.has_tool_calls() {
    // Execute tools and return results
}
}

Example Structure

Each example includes:

Complete, runnable code
Required dependencies
Step-by-step explanations
Common variations and use cases

Running Examples

Prerequisites

A running vLLM server:

pip install vllm
vllm serve Qwen/Qwen2.5-7B-Instruct --port 8000

Rust toolchain:

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

Running an Example

# Create a new project
cargo new my-vllm-app
cd my-vllm-app

# Add dependencies
cargo add vllm-client
cargo add tokio --features full
cargo add serde_json

# Copy example code to src/main.rs
# Then run:
cargo run

Common Patterns

Environment Configuration

#![allow(unused)]
fn main() {
use std::env;
use vllm_client::VllmClient;

fn create_client() -> VllmClient {
    VllmClient::builder()
        .base_url(env::var("VLLM_BASE_URL")
            .unwrap_or_else(|_| "http://localhost:8000/v1".to_string()))
        .api_key(env::var("VLLM_API_KEY").ok())
        .timeout_secs(300)
        .build()
}
}

Error Handling

#![allow(unused)]
fn main() {
use vllm_client::{VllmClient, VllmError};

async fn safe_chat(prompt: &str) -> Result<String, VllmError> {
    let client = VllmClient::new("http://localhost:8000/v1");
    
    let response = client.chat.completions().create()
        .model("Qwen/Qwen2.5-7B-Instruct")
        .messages(json!([{"role": "user", "content": prompt}]))
        .send()
        .await?;
    
    Ok(response.content.unwrap_or_default())
}
}

Reusing Client

#![allow(unused)]
fn main() {
use std::sync::Arc;
use vllm_client::VllmClient;

// Share client across threads
let client = Arc::new(VllmClient::new("http://localhost:8000/v1"));

// Use in multiple async tasks
let client1 = Arc::clone(&client);
let client2 = Arc::clone(&client);
}

vLLM Client - Rust Client for vLLM API

Examples

Available Examples

Basic Usage

Quick Examples

Hello World

Streaming Output

Tool Calling

Example Structure

Running Examples

Prerequisites

Running an Example

Common Patterns

Environment Configuration

Error Handling

Reusing Client

See Also