API Reference
AI Proxy Worker provides a simple yet powerful RESTful API that is fully compatible with OpenAI's Chat Completions API format, allowing you to easily integrate it into existing projects.
Current Support: DeepSeek API (v1.0)
Future Plans: Multi-AI service provider support including OpenAI, Claude, Gemini (v2.0)
🌐 Basic Information
Base URL
https://your-worker.workers.dev
Authentication
Authorization: Bearer YOUR_PROXY_KEY
Content-Type
Content-Type: application/json
📚 API Endpoints
1. Health Check
Check service status and connectivity.
Request:
GET /
Response:
{
"status": "ok",
"service": "AI Proxy Worker",
"timestamp": "2025-01-01T12:00:00.000Z"
}
Example:
curl https://your-worker.workers.dev/
2. Chat Completions
Interact with AI models; both streaming and non-streaming responses are supported.
Request:
POST /chat
Headers:
Authorization: Bearer YOUR_PROXY_KEY
Content-Type: application/json
Accept: application/json # Non-streaming
Accept: text/event-stream # Streaming
Request Body:
{
"model": "deepseek-chat",
"messages": [
{
"role": "system",
"content": "You are a helpful AI assistant."
},
{
"role": "user",
"content": "Hello!"
}
],
"stream": false,
"max_tokens": 2048
}
🤖 Supported Models
deepseek-chat
- Use Case: General conversation and text generation
- Architecture: DeepSeek-V3
- Features: Suitable for daily conversations, content creation, text understanding
- Context Length: 64K tokens
- Recommended Scenarios: General text generation, conversational applications
deepseek-reasoner
- Use Case: Complex reasoning and logical thinking
- Architecture: DeepSeek-R1
- Features: Math problems, logical reasoning, code analysis, complex reasoning
- Context Length: 64K tokens
- Recommended Scenarios: Tasks requiring deep thinking
Note: Model specifications and capabilities may change with DeepSeek updates. Check the DeepSeek Official Documentation for the latest information.
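Switching models only requires changing the `model` field; the rest of the request stays the same. For example, to route a reasoning-heavy question to deepseek-reasoner:
{
  "model": "deepseek-reasoner",
  "messages": [
    {"role": "user", "content": "If all A are B and some B are C, does it follow that some A are C?"}
  ],
  "max_tokens": 2048
}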
📝 Request Parameters
Required Parameters
| Parameter | Type | Description |
|---|---|---|
| `model` | string | Model name to use |
| `messages` | array | Array of conversation messages |
Optional Parameters
| Parameter | Type | Default | Description | Support Status |
|---|---|---|---|---|
| `stream` | boolean | false | Enable streaming response | ✅ Fully supported |
| `max_tokens` | number | - | Maximum tokens to generate | ✅ Fully supported |
| `temperature` | number | 1.0 | Control randomness (0-2) | ⚠️ May not work |
| `top_p` | number | 1.0 | Nucleus sampling parameter (0-1) | ⚠️ May not work |
| `frequency_penalty` | number | 0 | Frequency penalty (-2 to 2) | ⚠️ May not work |
| `presence_penalty` | number | 0 | Presence penalty (-2 to 2) | ⚠️ May not work |
| `stop` | array/string | null | Stop sequences | ✅ Supported |
| `seed` | number | null | Random seed for consistent output | ✅ Supported |
Note: Parameters marked "⚠️ May not work" may not have the expected effect due to DeepSeek API limitations. We recommend relying primarily on the `stream`, `max_tokens`, `stop`, and `seed` parameters.
Messages Format
Each message object contains:
{
"role": "user|assistant|system",
"content": "Message content"
}
Role Descriptions:
- `system`: System prompt; defines the AI's behavior
- `user`: User input
- `assistant`: AI response
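For multi-turn conversations, replay the prior turns in order so the model has full context. A `messages` array might look like:
[
  {"role": "system", "content": "You are a helpful AI assistant."},
  {"role": "user", "content": "What is an API?"},
  {"role": "assistant", "content": "An API is an interface that lets programs communicate with each other."},
  {"role": "user", "content": "Can you give an example?"}
]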
📤 Response Format
Non-streaming Response
Success Response:
{
"id": "chatcmpl-123",
"object": "chat.completion",
"created": 1677652288,
"model": "deepseek-chat",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! I'm DeepSeek, happy to help you."
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 20,
"completion_tokens": 15,
"total_tokens": 35
}
}
Streaming Response
When `stream: true` is set, the response is delivered in Server-Sent Events (SSE) format:
data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1677652288,"model":"deepseek-chat","choices":[{"index":0,"delta":{"role":"assistant"},"finish_reason":null}]}
data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1677652288,"model":"deepseek-chat","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}
data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1677652288,"model":"deepseek-chat","choices":[{"index":0,"delta":{"content":"!"},"finish_reason":null}]}
data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1677652288,"model":"deepseek-chat","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}
data: [DONE]
⚠️ Important: Parameter Compatibility
According to the DeepSeek official documentation, the following parameters may not work as expected:
- `temperature`: May be ignored; the DeepSeek API may use a fixed temperature value
- `top_p`: May not work
- `frequency_penalty`: May not work
- `presence_penalty`: May not work
Recommended Approach:
- Primarily use the `model`, `messages`, `max_tokens`, `stream`, and `stop` parameters
- To control generation behavior, use `system` messages to guide the model
- You can try the other parameters during testing, but don't rely on their effects
Example - Recommended Request Format:
{
"model": "deepseek-chat",
"messages": [
{
"role": "system",
"content": "Please answer concisely, don't be overly detailed."
},
{
"role": "user",
"content": "What is artificial intelligence?"
}
],
"max_tokens": 500,
"stream": false
}
🔧 Complete Examples
cURL Examples
Non-streaming Request:
curl -X POST https://your-worker.workers.dev/chat \
-H "Authorization: Bearer YOUR_PROXY_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "deepseek-chat",
"messages": [
{
"role": "system",
"content": "You are a professional programming assistant."
},
{
"role": "user",
"content": "Please write a Python quicksort function for me."
}
],
"max_tokens": 1000
}'
Streaming Request:
curl -X POST https://your-worker.workers.dev/chat \
-H "Authorization: Bearer YOUR_PROXY_KEY" \
-H "Content-Type: application/json" \
-H "Accept: text/event-stream" \
-d '{
"model": "deepseek-chat",
"messages": [
{"role": "user", "content": "Write a poem about programming"}
],
"stream": true
}'
JavaScript Examples
Basic Call:
async function callAI(message) {
const response = await fetch('https://your-worker.workers.dev/chat', {
method: 'POST',
headers: {
'Authorization': 'Bearer YOUR_PROXY_KEY',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'deepseek-chat',
messages: [
{ role: 'user', content: message }
],
max_tokens: 1000
})
});
if (!response.ok) {
throw new Error(`HTTP error! status: ${response.status}`);
}
const data = await response.json();
return data.choices[0].message.content;
}
// Usage example
callAI('Hello, please introduce yourself')
.then(result => console.log(result))
.catch(error => console.error('Error:', error));
Streaming Call:
async function streamAI(message, onChunk) {
const response = await fetch('https://your-worker.workers.dev/chat', {
method: 'POST',
headers: {
'Authorization': 'Bearer YOUR_PROXY_KEY',
'Content-Type': 'application/json',
'Accept': 'text/event-stream',
},
body: JSON.stringify({
model: 'deepseek-chat',
messages: [{ role: 'user', content: message }],
stream: true
})
});
  if (!response.ok) {
    throw new Error(`HTTP error! status: ${response.status}`);
  }
  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  let buffer = '';
  try {
    while (true) {
      const { done, value } = await reader.read();
      if (done) break;
      // Buffer partial data so SSE lines split across chunks parse correctly
      buffer += decoder.decode(value, { stream: true });
      const lines = buffer.split('\n');
      buffer = lines.pop();
for (const line of lines) {
if (line.startsWith('data: ')) {
const data = line.slice(6);
if (data === '[DONE]') return;
try {
const parsed = JSON.parse(data);
const content = parsed.choices[0]?.delta?.content;
if (content) {
onChunk(content);
}
} catch (e) {
// Ignore parsing errors
}
}
}
}
} finally {
reader.releaseLock();
}
}
// Usage example
streamAI('Write a story about AI', (chunk) => {
process.stdout.write(chunk); // Real-time output
});
Python Examples
Basic Call:
import requests
import json
def call_ai(message, model="deepseek-chat"):
url = "https://your-worker.workers.dev/chat"
headers = {
"Authorization": "Bearer YOUR_PROXY_KEY",
"Content-Type": "application/json"
}
payload = {
"model": model,
"messages": [
{"role": "user", "content": message}
],
"max_tokens": 1000
}
response = requests.post(url, headers=headers, json=payload)
response.raise_for_status()
data = response.json()
return data["choices"][0]["message"]["content"]
# Usage example
result = call_ai("Please explain what machine learning is")
print(result)
Streaming Call:
import requests
import json
def stream_ai(message, model="deepseek-chat"):
url = "https://your-worker.workers.dev/chat"
headers = {
"Authorization": "Bearer YOUR_PROXY_KEY",
"Content-Type": "application/json",
"Accept": "text/event-stream"
}
payload = {
"model": model,
"messages": [{"role": "user", "content": message}],
"stream": True
}
response = requests.post(url, headers=headers, json=payload, stream=True)
response.raise_for_status()
for line in response.iter_lines():
if line:
line = line.decode('utf-8')
if line.startswith('data: '):
data = line[6:]
if data == '[DONE]':
break
try:
parsed = json.loads(data)
content = parsed["choices"][0]["delta"].get("content")
if content:
print(content, end='', flush=True)
except json.JSONDecodeError:
continue
# Usage example
stream_ai("Write a poem about spring")
iOS Swift Examples
import Foundation
class AIProxyClient {
private let baseURL = "https://your-worker.workers.dev"
private let apiKey = "YOUR_PROXY_KEY"
func chatCompletion(
model: String = "deepseek-chat",
messages: [[String: String]],
maxTokens: Int = 1000
) async throws -> String {
guard let url = URL(string: "\(baseURL)/chat") else {
throw APIError.invalidURL
}
var request = URLRequest(url: url)
request.httpMethod = "POST"
request.setValue("Bearer \(apiKey)", forHTTPHeaderField: "Authorization")
request.setValue("application/json", forHTTPHeaderField: "Content-Type")
let requestBody: [String: Any] = [
"model": model,
"messages": messages,
"max_tokens": maxTokens
]
request.httpBody = try JSONSerialization.data(withJSONObject: requestBody)
let (data, response) = try await URLSession.shared.data(for: request)
guard let httpResponse = response as? HTTPURLResponse,
httpResponse.statusCode == 200 else {
throw APIError.requestFailed
}
        // Decode defensively instead of force-casting, so malformed JSON throws rather than crashes
        guard let result = try JSONSerialization.jsonObject(with: data) as? [String: Any],
              let choices = result["choices"] as? [[String: Any]],
              let message = choices.first?["message"] as? [String: Any],
              let content = message["content"] as? String else {
            throw APIError.requestFailed
        }
        return content
}
}
enum APIError: Error {
case invalidURL
case requestFailed
}
// Usage example
let client = AIProxyClient()
Task {
do {
let response = try await client.chatCompletion(
messages: [
["role": "user", "content": "Hello, please introduce yourself"]
]
)
print(response)
} catch {
print("Error: \(error)")
}
}
❌ Error Handling
Error Response Format
All errors return a unified JSON format:
{
"error": "error_type",
"details": "Detailed error message",
"timestamp": "2025-01-01T12:00:00.000Z"
}
Common Error Codes
| HTTP Status | Error Type | Description |
|---|---|---|
| 400 | `invalid_request` | Request format error |
| 401 | `unauthorized` | Authentication failed |
| 404 | `not_found` | Endpoint not found |
| 413 | `payload_too_large` | Request body too large |
| 500 | `internal_error` | Internal server error |
| 502 | `upstream_error` | Upstream API error |
| 504 | `timeout` | Request timeout |
Error Handling Example
async function handleAPICall(message) {
try {
const response = await fetch('https://your-worker.workers.dev/chat', {
method: 'POST',
headers: {
'Authorization': 'Bearer YOUR_PROXY_KEY',
'Content-Type': 'application/json',
},
body: JSON.stringify({
model: 'deepseek-chat',
messages: [{ role: 'user', content: message }]
})
});
if (!response.ok) {
const errorData = await response.json();
throw new Error(`API Error (${response.status}): ${errorData.error} - ${errorData.details}`);
}
return await response.json();
} catch (error) {
console.error('API call failed:', error.message);
// Handle different error types
if (error.message.includes('401')) {
console.log('Please check if API key is correct');
} else if (error.message.includes('504')) {
console.log('Request timeout, please try again later');
} else if (error.message.includes('413')) {
console.log('Request content too long, please reduce input');
}
throw error;
}
}
🔒 Security Best Practices
1. API Key Management
- Never hardcode the `PROXY_KEY` in client code
- Use environment variables or secure configuration management (see the sketch after this list)
- Rotate keys regularly
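For example, a Node.js client might read the key from an environment variable rather than embedding it (a minimal sketch; the `AI_PROXY_KEY` variable name is illustrative):
// Read the proxy key from the environment instead of hardcoding it
const apiKey = process.env.AI_PROXY_KEY;
if (!apiKey) {
  throw new Error('AI_PROXY_KEY environment variable is not set');
}

const headers = {
  'Authorization': `Bearer ${apiKey}`,
  'Content-Type': 'application/json',
};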
2. Request Validation
- Validate user input to prevent injection attacks
- Limit request frequency to prevent abuse (see the sketch after this list)
- Log and monitor abnormal requests
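A minimal in-memory rate limiter sketch (illustrative only; a production setup would use durable storage such as Cloudflare KV or Durable Objects so limits survive restarts):
// Sliding-window limiter: allow at most `limit` requests per `windowMs` per client
const requestLog = new Map();

function isRateLimited(clientId, limit = 60, windowMs = 60_000) {
  const now = Date.now();
  // Keep only timestamps that fall inside the current window
  const recent = (requestLog.get(clientId) || []).filter(t => now - t < windowMs);
  if (recent.length >= limit) {
    requestLog.set(clientId, recent);
    return true; // over the limit; reject the request
  }
  recent.push(now);
  requestLog.set(clientId, recent);
  return false;
}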
3. Content Filtering
function sanitizeInput(content) {
// Remove potentially malicious content
return content
    .replace(/<script[^>]*>[\s\S]*?<\/script>/gi, '') // strip script tags, including multi-line ones
    .replace(/<[^>]*>/g, '') // strip any remaining HTML tags
.trim();
}
const sanitizedMessage = sanitizeInput(userInput);
📊 Usage Limits
Cloudflare Workers Limits
- Request Timeout: 30 seconds (configurable)
- Request Body Size: 1MB (configurable)
- Request Rate: 1000 requests/minute (free tier)
- CPU Time: 10 ms per request (free tier)
DeepSeek API Limits
- Rate Limits: Based on your DeepSeek account plan
- Context Length: 64K tokens
- Concurrent Connections: Based on account type
🚀 Performance Optimization Tips
1. Caching Strategy
// Simple memory cache example
const cache = new Map();
function getCachedResponse(key) {
const cached = cache.get(key);
if (cached && Date.now() - cached.timestamp < 300000) { // 5-minute cache
return cached.data;
}
return null;
}
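The helper above only reads from the cache; a matching write path (same `cache` map) completes the pattern:
function setCachedResponse(key, data) {
  cache.set(key, { data, timestamp: Date.now() });
}

// Usage: consult the cache first, store fresh responses after the call
let reply = getCachedResponse(prompt);
if (!reply) {
  reply = await callAI(prompt); // callAI from the JavaScript example above
  setCachedResponse(prompt, reply);
}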
2. Request Optimization
- Set a reasonable `max_tokens` to avoid unnecessarily long responses
- Use appropriate `temperature` values
- Use faster models for simple tasks
3. Streaming Response
- Use streaming response for long text generation to improve user experience
- Implement appropriate error retry mechanisms
- Consider implementing request cancellation (see the sketch below)
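Cancellation can be wired up with a standard `AbortController` (a sketch based on the streaming request above):
const controller = new AbortController();

// Abort automatically if the request runs longer than 30 seconds
const timeout = setTimeout(() => controller.abort(), 30_000);

const response = await fetch('https://your-worker.workers.dev/chat', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_PROXY_KEY',
    'Content-Type': 'application/json',
    'Accept': 'text/event-stream',
  },
  body: JSON.stringify({
    model: 'deepseek-chat',
    messages: [{ role: 'user', content: 'Write a long story' }],
    stream: true
  }),
  signal: controller.signal // fetch rejects with an AbortError when aborted
});
clearTimeout(timeout);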
Need More Help? 👉 View Usage Examples | Troubleshooting