AI is powering many products, and WebShell now integrates AI services as well.
Here is what I’ve built with AI so far, along with some remaining TODOs.

Prompt engineering
If the AI service supports a system prompt, you can set it like this:
You are a Shell expert. I want to know how to complete tasks in the terminal or command line. I will ask shell-related questions and you return solutions. If asked who you are, reply with “I am the WebShell AI assistant.” If asked non-shell questions, reply with “Not shell-related, I don’t know.”

If the AI service doesn’t support system prompts, craft a user prompt instead:

You are a Shell expert. I want to know how to complete tasks in the terminal or command line. I will ask shell-related questions and you return solutions. If asked who you are, reply with “I am the WebShell AI assistant.” If asked non-shell questions, reply with “Not shell-related, I don’t know.” My question is: {{userPrompt}}

User prompt engineering can play a role similar to a system prompt.
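As a minimal sketch of the two approaches, here is how the messages could be assembled, assuming an OpenAI-style chat “messages” format; the type and function names below are illustrative, not any provider’s SDK:

```ts
// Sketch assuming an OpenAI-style chat "messages" format; names are illustrative.
const SHELL_EXPERT_PROMPT =
  "You are a Shell expert. I want to know how to complete tasks in the terminal " +
  "or command line. I will ask shell-related questions and you return solutions. " +
  'If asked who you are, reply with "I am the WebShell AI assistant." ' +
  'If asked non-shell questions, reply with "Not shell-related, I don\'t know."';

type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

// Provider supports system prompts: keep the instructions and the question separate.
function buildWithSystemPrompt(userPrompt: string): ChatMessage[] {
  return [
    { role: "system", content: SHELL_EXPERT_PROMPT },
    { role: "user", content: userPrompt },
  ];
}

// Provider has no system prompt: fold the instructions into the user message.
function buildUserOnlyPrompt(userPrompt: string): ChatMessage[] {
  return [
    { role: "user", content: `${SHELL_EXPERT_PROMPT} My question is: ${userPrompt}` },
  ];
}
```

Keeping the instructions in one constant makes it easy to switch between the two strategies per provider.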
Prompt language matters
Due to how LLMs work, the same prompt can yield different results depending on sampling parameters such as temperature. The prompt language (Chinese vs. English) also affects the output, so for domestic Chinese AI services I use Chinese prompts.
Conversation context handling
Don’t send unbounded history. For example, keep only the last 5 messages (2 complete turns plus the latest user prompt), which keeps the message count odd.
A “turn” is one user message plus one assistant reply. If the user asked something and then stopped the response before the assistant answered, don’t carry that orphaned user message into the next request’s context; keep the pairing 1:1.
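A small TypeScript sketch of that trimming logic, assuming the same chat-message shape as in the prompt sketch above (names are illustrative):

```ts
// Sketch: keep the last 2 complete turns plus the latest user prompt (5 messages total).
type ChatMessage = { role: "user" | "assistant"; content: string };

function buildContext(history: ChatMessage[], latestPrompt: string): ChatMessage[] {
  // Drop a trailing user message whose assistant reply was cancelled,
  // so the history only contains complete user/assistant pairs.
  const paired =
    history.length > 0 && history[history.length - 1].role === "user"
      ? history.slice(0, -1)
      : history;

  // Two turns = four messages; adding the new prompt keeps the count odd (five).
  return [...paired.slice(-4), { role: "user", content: latestPrompt }];
}
```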
Token costs
- AI usage is typically billed by tokens; both requests and responses consume tokens.
- To reduce costs, some APIs (e.g., OpenAI’s) let you cap the response length via max_tokens; some domestic APIs, such as Hunyuan and Wenxin, don’t yet (see the request sketch after this list).
- Streaming vs. non-streaming: in streaming mode, canceling the stream can interrupt generation on some providers and so reduce token usage, but not all providers stop generation promptly.
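For providers that do support it, the cap is a single extra request field. A sketch against an OpenAI-style chat-completions endpoint (the model name, endpoint, and key handling are illustrative; error handling is omitted):

```ts
// Sketch: cap the response length with max_tokens on an OpenAI-style endpoint.
async function askWithTokenCap(messages: object[], apiKey: string) {
  const res = await fetch("https://api.openai.com/v1/chat/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify({
      model: "gpt-3.5-turbo",
      messages,
      max_tokens: 256, // hard cap on response tokens; prompt tokens are still billed
    }),
  });
  return res.json();
}
```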
Streaming
Benefits:
- Better UX: the model starts returning tokens as soon as they’re generated, so users see partial content sooner. Total completion time is similar to non-streaming.
- Potentially lower token usage if generation stops quickly after canceling (provider-dependent); see the sketch below.
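A rough sketch of a cancelable streaming request using fetch and AbortController, assuming an OpenAI-style SSE stream (stream: true); whether the provider actually stops generating after the abort is provider-dependent:

```ts
// Sketch: stream the response and allow the user to cancel mid-generation.
async function streamAnswer(body: object, apiKey: string, controller: AbortController) {
  const res = await fetch("https://api.openai.com/v1/chat/completions", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify({ ...body, stream: true }),
    signal: controller.signal, // controller.abort() cancels the stream client-side
  });

  const reader = res.body!.getReader();
  const decoder = new TextDecoder();
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    // Each chunk holds one or more SSE "data:" lines with partial tokens.
    console.log(decoder.decode(value, { stream: true }));
  }
}
```

Calling controller.abort() from the UI stops reading immediately; whether the backend stops generating (and billing) further tokens depends on the provider.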
TODO
Fine-tuning
Prompt engineering can shape answers but is often insufficient. Fine-tuning helps steer answers for a class of queries by shifting output probabilities based on many examples.
I haven’t done fine-tuning for WebShell yet, so this section is a placeholder.
Resources
Final Thoughts
The AI service I currently use doesn’t support system prompts, so I compose user prompts instead; the results are imperfect.

