Commit e9e60dfa authored by Vũ Hoàng Anh's avatar Vũ Hoàng Anh

feat: Codex/Responses API compatibility, cache TTL 24h, prompt optimization,...

feat: Codex/Responses API compatibility, cache TTL 24h, prompt optimization, n8n test tools, gitignore cleanup
parent 90ea36b3
......@@ -2,6 +2,19 @@
"mcpServers": {
"canifa-api": {
"url": "http://localhost:5000/mcp"
},
"n8n-mcp": {
"command": "npx",
"args": [
"n8n-mcp"
],
"env": {
"MCP_MODE": "stdio",
"LOG_LEVEL": "error",
"DISABLE_CONSOLE_OUTPUT": "true",
"N8N_API_URL": "http://localhost:5678",
"N8N_API_KEY": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIwOTdkMTNhOS01NzQ0LTQyY2UtYTM5Yi00YjMwZTk4NDU4OWMiLCJpc3MiOiJuOG4iLCJhdWQiOiJwdWJsaWMtYXBpIiwianRpIjoiMTVmZmNlZjUtNzkzOC00MWU4LTg5NzktY2NhMWI0YzUzY2RmIiwiaWF0IjoxNzcyNjc1OTM3fQ.K58ZsX8BgdukDdON15sMCQ0eynTeYSEbi7nF6xIPY9I"
}
}
}
}
\ No newline at end of file
......@@ -55,3 +55,25 @@ Thumbs.db
run.txt
backend/agent/tools/query.txt
backend/schema_dump.json
# Document folder
document/
# n8n workflow exports & temp files
canifa_workflow_export.json
prod_workflow.json
prod_workflow_fixed.json
fix_n8n_connections.py
*.png
!backend/static/**/*.png
# Playwright MCP
.playwright-mcp/
# Test credentials (sensitive)
backend/tests/google_credentials.json
backend/tests/google_sheets_credentials.json
backend/tests/sheet_info.json
backend/tests/test_n8n_api_output.txt
backend/n8n_result.json
diff_*.txt
# 🔬 VERIFICATION: LangGraph Streaming Behavior
## 🎯 PURPOSE
Check whether LangGraph `astream()` streams **incrementally** (chunk by chunk) or only emits an event **after a node completes**.
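A minimal timing probe makes the distinction observable. The sketch below is a generic harness, not the project's actual code: `probe_events` consumes any async iterator (in practice you would pass `graph.astream(initial_state, exec_config)`) and records when each event arrives; here it is demonstrated against a fake two-event stream.

```python
import asyncio
import time

async def probe_events(stream):
    """Consume an async iterator and record (elapsed_seconds, event) pairs."""
    start = time.monotonic()
    timeline = []
    async for event in stream:
        timeline.append((round(time.monotonic() - start, 2), event))
    return timeline

async def fake_astream():
    """Stand-in for graph.astream(): two events spaced 50 ms apart."""
    yield {"messages": []}
    await asyncio.sleep(0.05)
    yield {"ai_response": '{"ai_response": "..."}'}

timeline = asyncio.run(probe_events(fake_astream()))
for t, event in timeline:
    print(f"Event at t={t}s | Keys: {list(event.keys())}")
```

Many closely spaced `ai_response` events with growing content indicate Scenario 1; a single late event with the full payload indicates Scenario 2.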
---
## 📊 EXPECTED RESULTS
### **Scenario 1: Incremental Streaming (Ideal)** ✅
If LangGraph streams incrementally, the backend logs will show:
```
🌊 Starting LLM streaming...
📦 Event #1 at t=2.50s | Keys: ['messages']
📦 Event #2 at t=3.20s | Keys: ['ai_response']
📡 Event #2 (t=3.20s): ai_response with 150 chars
Preview: {"ai_response": "Anh chọn áo thun th...
📦 Event #3 at t=4.10s | Keys: ['ai_response']
📡 Event #3 (t=4.10s): ai_response with 380 chars
Preview: {"ai_response": "Anh chọn áo thun thể thao nam chuẩn luôn! Em tìm...
📦 Event #4 at t=5.50s | Keys: ['ai_response']
📡 Event #4 (t=5.50s): ai_response with 620 chars
Preview: {"ai_response": "...", "product_ids": ["SKU1", "SKU2"]...
🎯 Event #4 (t=5.50s): Regex matched product_ids!
✅ Extracted 3 SKUs: ['SKU1', 'SKU2', 'SKU3']
🚨 BREAKING at Event #4 (t=5.50s) - user_insight KHÔNG ĐỢI!
```
**→ Content grows incrementally (150 → 380 → 620 chars)**
**→ Early break as soon as product_ids appears (t=5.5s instead of t=12s)**
---
### **Scenario 2: Event-based (After completion)** ❌
If LangGraph only emits after the node finishes, the logs will be:
```
🌊 Starting LLM streaming...
📦 Event #1 at t=2.30s | Keys: ['messages'] ← Tool execution
📦 Event #2 at t=11.80s | Keys: ['ai_response'] ← LLM node completed
📡 Event #2 (t=11.80s): ai_response with 1250 chars ← ENTIRE RESPONSE
Preview: {"ai_response": "Anh chọn áo thun thể thao nam chuẩn luôn!...", "product_ids": ["SKU1", "SKU2", "SKU3"], "user_insight": {...}}
🎯 Event #2 (t=11.80s): Regex matched product_ids!
✅ Extracted 3 SKUs: ['SKU1', 'SKU2', 'SKU3']
🚨 BREAKING at Event #2 (t=11.80s) - user_insight KHÔNG ĐỢI!
```
**→ ONLY 1 event, carrying the full content**
**→ Emitted only after the LLM has fully finished (t=11.8s)**
**→ Impossible to break any earlier!**
---
## 🔍 ANALYSIS
### **If Scenario 2 (Event-based):**
**Explanation:**
- The LLM **is streaming tokens internally** from t=2s → t=12s
- LangGraph **waits for the node to finish** before emitting an event
- That event therefore contains the **full response**
- The regex matches immediately because everything is already present
**Conclusion:**
- ✅ The code is correct and streaming is enabled
- ❌ But we cannot break any earlier, because the event does not exist yet
- ⏱️ Latency cannot be reduced (~12s)
---
## 💡 SOLUTIONS
If the result is Scenario 2, true streaming requires one of:
### **Option A: Custom Streaming Callback**
```python
from langchain.callbacks.base import AsyncCallbackHandler

class StreamingCallback(AsyncCallbackHandler):
    def __init__(self):
        super().__init__()
        self.accumulated = ""  # tokens received so far

    async def on_llm_new_token(self, token: str, **kwargs):
        # Accumulate tokens and check for the product_ids marker
        self.accumulated += token
        if '"product_ids"' in self.accumulated:
            # Signal the consumer to break early (e.g., set an asyncio.Event)
            pass
```
### **Option B: SSE Endpoint**
Stream events directly to the client and let the client do the parsing.
### **Option C: Keep as-is**
The code is already optimal within these constraints; accept the latency.
---
## 📝 NOTES
- **streaming=True** on the LLM → LangChain streams tokens internally
- **graph.astream()** → streams events, not tokens
- An **early break** is only meaningful if events are emitted incrementally
**Check the backend logs to determine which scenario applies!**
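The early-break condition itself can be exercised without LangGraph. The sketch below simulates an accumulating buffer (chunk sizes loosely mirror Scenario 1) and shows that the `product_ids` regex only fires once the complete list is present in the buffer:

```python
import re

PID_RE = re.compile(r'"product_ids"\s*:\s*\[(.*?)\]', re.DOTALL)

# Simulated incremental chunks of the LLM's JSON output
chunks = [
    '{"ai_response": "Anh chọn áo thun th',
    'ể thao nam chuẩn luôn! Em tìm...", ',
    '"product_ids": ["SKU1", "SKU2", "SKU3"]}',
]

accumulated = ""
extracted = None
for i, chunk in enumerate(chunks, start=1):
    accumulated += chunk
    m = PID_RE.search(accumulated)
    if m:
        extracted = re.findall(r'"([^"]+)"', m.group(1))
        break  # early break: no need to wait for later events

print(extracted)  # ['SKU1', 'SKU2', 'SKU3'], matched at chunk 3
```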
......@@ -268,6 +268,8 @@ async def chat_controller(
# Extract ai_response from streaming content (fallbacks)
if early_response and not ai_text_response:
raw_content = streaming_callback.accumulated_content
# Strip Codex reasoning objects before parsing
raw_content = ProductIDStreamingCallback.strip_reasoning(raw_content)
if raw_content:
try:
raw_normalized = raw_content.replace("{{", "{").replace("}}", "}")
......@@ -284,10 +286,21 @@ async def chat_controller(
if not ai_text_response and all_accumulated_messages:
for msg in reversed(all_accumulated_messages):
if isinstance(msg, AIMessage) and msg.content:
ai_text_response = msg.content
# Responses API may return content as list
content = msg.content
if isinstance(content, list):
content = "".join(str(c.get("text", c) if isinstance(c, dict) else c) for c in content)
# Strip Codex reasoning objects
content = ProductIDStreamingCallback.strip_reasoning(content)
ai_text_response = content
break
# Parse JSON-wrapped ai_response
# Ensure ai_text_response is str (Responses API may return list)
if isinstance(ai_text_response, list):
ai_text_response = "".join(str(c.get("text", c) if isinstance(c, dict) else c) for c in ai_text_response)
# Strip Codex reasoning objects before JSON parse
ai_text_response = ProductIDStreamingCallback.strip_reasoning(ai_text_response)
if ai_text_response and ai_text_response.lstrip().startswith("{"):
try:
ai_normalized = ai_text_response.replace("{{", "{").replace("}}", "}")
......@@ -297,7 +310,13 @@ async def chat_controller(
if not final_product_ids and isinstance(ai_json.get("product_ids"), list):
final_product_ids = [str(s) for s in ai_json["product_ids"]]
except json.JSONDecodeError:
pass
# Regex fallback for Codex {{/}} braces that break JSON parse
ai_match = re.search(r'"ai_response"\s*:\s*"((?:[^"\\]|\\.)*)"\s*,\s*"product_ids"', ai_text_response, re.DOTALL)
if ai_match:
ai_text_response = ai_match.group(1).replace('\\"', '"').replace("\\n", "\n")
pid_match = re.search(r'"product_ids"\s*:\s*\[(.*?)\]', ai_text_response if not ai_match else ai_normalized, re.DOTALL)
if pid_match and not final_product_ids:
final_product_ids = re.findall(r'"([^"]+)"', pid_match.group(1))
# Extract & filter products
enriched_products = []
......
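The fallback path in the hunk above (double-brace normalization, then regex extraction when JSON parsing still fails) can be illustrated in isolation; the sample payload below is invented for the demonstration:

```python
import json
import re

# A Codex-style payload whose doubled braces break json.loads
raw = '{{"ai_response": "Em gợi ý mẫu này ạ", "product_ids": ["SKU1", "SKU2"]}}'

normalized = raw.replace("{{", "{").replace("}}", "}")
try:
    parsed = json.loads(normalized)
    ai_text = parsed["ai_response"]
    product_ids = [str(s) for s in parsed["product_ids"]]
except json.JSONDecodeError:
    # Regex fallback when normalization still leaves invalid JSON
    m = re.search(r'"ai_response"\s*:\s*"((?:[^"\\]|\\.)*)"', raw, re.DOTALL)
    ai_text = m.group(1) if m else raw
    pid = re.search(r'"product_ids"\s*:\s*\[(.*?)\]', raw, re.DOTALL)
    product_ids = re.findall(r'"([^"]+)"', pid.group(1)) if pid else []

print(ai_text, product_ids)
```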
"""
Agent Helper Functions
Utility functions for the chat controller.
"""
import json
import logging
import uuid
from decimal import Decimal
from langchain_core.messages import HumanMessage, ToolMessage
from langchain_core.runnables import RunnableConfig
from common.conversation_manager import ConversationManager
from common.langfuse_client import get_callback_handler
from common.starrocks_connection import get_db_connection
from .models import AgentState
import re
logger = logging.getLogger(__name__)
# ==============================================================================
# FORMAT PRODUCT RESULTS
# ==============================================================================
def _parse_description_text(desc: str) -> dict:
"""Parse description_text_full thành dict các field (backward compatibility)."""
result = {}
if not desc:
return result
name_match = re.search(r"product_name:\s*(.+?)\.(?:\s+master_color:|$)", desc)
if name_match:
result["product_name"] = name_match.group(1).strip()
thumb_match = re.search(r"product_image_url_thumbnail:\s*(https?://[^\s]+?)\.(?:\s+product_web_url:|$)", desc)
if thumb_match:
result["product_image_url_thumbnail"] = thumb_match.group(1).strip()
url_match = re.search(r"product_web_url:\s*(https?://[^\s]+?)\.(?:\s+description_text:|$)", desc)
if url_match:
result["product_web_url"] = url_match.group(1).strip()
color_match = re.search(r"master_color:\s*(.+?)\.(?:\s+product_image_url:|$)", desc)
if color_match:
result["master_color"] = color_match.group(1).strip()
return result
def format_product_results(products: list[dict]) -> list[dict]:
"""
Format products - GROUP by base SKU (magento_ref_code), with multiple colors.
Output format:
{
"sku": "1DS25S008",
"name": "Váy liền bé gái",
"colors": [
{"color": "Cam/ Orange", "color_code": "SO123", "url": "...", "thumbnail": "..."},
{"color": "Xanh/ Blue", "color_code": "SB456", "url": "...", "thumbnail": "..."}
],
"price": 349000,
"sale_price": 244000,
"description": "..."
}
"""
max_products = 15
grouped: dict[str, dict] = {} # {magento_ref_code: {product_info}}
for p in products:
# Extract product info
if p.get("product_name"):
name = p["product_name"]
color_name = p.get("master_color") or ""
thumb_url = p.get("product_image_url_thumbnail") or ""
web_url = p.get("product_web_url") or ""
# Fallback: parse from description_text_full if fields are empty
if not color_name or not thumb_url or not web_url:
parsed = _parse_description_text(p.get("description_text_full", ""))
color_name = color_name or parsed.get("master_color", "")
thumb_url = thumb_url or parsed.get("product_image_url_thumbnail", "")
web_url = web_url or parsed.get("product_web_url", "")
else:
desc_full = p.get("description_text_full", "")
parsed = _parse_description_text(desc_full)
name = parsed.get("product_name", "")
color_name = parsed.get("master_color", "")
thumb_url = parsed.get("product_image_url_thumbnail", "")
web_url = parsed.get("product_web_url", "")
original_price = p.get("original_price") or 0
sale_price = p.get("sale_price") or 0
magento_ref = p.get("magento_ref_code", "")
product_color_code = p.get("product_color_code", "")
        # Extract the color code from product_color_code (e.g., 1DS25S008-SO123 → SO123)
color_code_only = ""
if product_color_code and "-" in product_color_code:
parts = product_color_code.split("-", 1)
color_code_only = parts[1] if len(parts) > 1 else ""
# Use magento_ref as base SKU for grouping
base_sku = magento_ref if magento_ref else product_color_code
if not base_sku:
continue
# Color variant info
color_variant = {
"color": color_name,
"color_code": color_code_only,
"url": web_url,
"thumbnail": thumb_url,
}
if base_sku in grouped:
# Add color to existing product
existing_colors = [c["color"] for c in grouped[base_sku]["colors"]]
if color_name and color_name not in existing_colors:
grouped[base_sku]["colors"].append(color_variant)
# Update price range if different
if sale_price and sale_price < grouped[base_sku].get("sale_price", float("inf")):
grouped[base_sku]["sale_price"] = int(sale_price)
else:
# New product - use first color's URL/thumbnail as default
product_entry = {
"sku": base_sku,
"name": name,
"color": color_name, # First color as default
"colors": [color_variant] if color_name else [],
"price": int(original_price),
"sale_price": int(sale_price) if sale_price else int(original_price),
"url": web_url, # First color's URL
"thumbnail_image_url": thumb_url, # First color's thumbnail
"description": (p.get("description_text") or "")[:200],
}
# Include sizes if available (pipe-separated → list)
size_scale = p.get("size_scale")
if size_scale:
product_entry["sizes"] = [s.strip() for s in size_scale.split("|") if s.strip()]
# Include quantity_sold if available (for best seller)
qty_sold = p.get("quantity_sold")
if qty_sold is not None:
product_entry["quantity_sold"] = int(qty_sold)
grouped[base_sku] = product_entry
formatted = list(grouped.values())[:max_products]
logger.info(f"📦 Formatted {len(formatted)} products (grouped by SKU)")
return formatted
def decimal_default(obj):
"""
JSON serializer for objects not serializable by default json code.
Handles Decimal objects.
"""
if isinstance(obj, Decimal):
return float(obj)
raise TypeError(f"Object of type {obj.__class__.__name__} is not JSON serializable")
def extract_product_ids(messages: list) -> list[dict]:
"""
Extract full product info from tool messages (data_retrieval_tool results).
Returns list of product objects with: sku, name, price, sale_price, url, thumbnail_image_url.
"""
products = []
seen_skus = set()
for msg in messages:
if isinstance(msg, ToolMessage):
try:
# Tool result is JSON string
tool_result = json.loads(msg.content)
# Check if tool returned products (new format with "results" wrapper)
if tool_result.get("status") == "success":
# Handle both direct "products" and nested "results" format
product_list = []
if "results" in tool_result:
results_data = tool_result["results"]
if results_data and isinstance(results_data, list):
# Check first item to determine format
first_item = results_data[0] if len(results_data) > 0 else {}
if isinstance(first_item, dict) and "products" in first_item:
# Nested format: {"results": [{"products": [...]}]}
for result_item in results_data:
product_list.extend(result_item.get("products", []))
else:
# Flat format: {"results": [product1, product2]} (Current)
product_list = results_data
elif "products" in tool_result:
# Legacy format: {"products": [...]}
product_list = tool_result["products"]
logger.warning(f"🛠️ [EXTRACT] Extracted {len(product_list)} products from tool")
for product in product_list:
                        # ⚡ FLATTEN: split variants into separate products (each color = 1 product)
                        # Grouped: {"product_id": "...", "name": "...", "variants": [...]}
                        # → split into N products (1 per variant)
# Check if grouped format (has variants)
if "variants" in product and product.get("variants"):
# Grouped product - expand EACH variant into separate product
product_name = product.get("name", "")
                            base_sku = product.get("product_id")  # ✅ Keep the same base SKU across variants
for variant in product["variants"]:
variant_sku = variant.get("sku")
# ✅ Use base_sku instead of variant_sku for consistency
display_sku = base_sku if base_sku else variant_sku
# Create unique key for dedup using variant SKU
dedup_key = variant_sku or display_sku
if dedup_key and dedup_key not in seen_skus:
seen_skus.add(dedup_key)
# ✅ Spread ALL variant fields first, then override sku + name
product_obj = {
**variant, # Copy all variant fields (color, price, discount, stock, url, thumbnail, etc.)
"sku": display_sku, # Override with base SKU
"name": product_name, # Override with product name
}
products.append(product_obj)
else:
# Flat format - use directly
sku = product.get("sku") or product.get("internal_ref_code")
if sku and sku not in seen_skus:
seen_skus.add(sku)
product_obj = {
"sku": sku,
"name": product.get("name", ""),
"color": product.get("color", ""),
"price": product.get("price", 0),
"sale_price": product.get("sale_price"),
"url": product.get("url", ""),
"thumbnail_image_url": product.get("thumbnail_image_url", ""),
}
products.append(product_obj)
except (json.JSONDecodeError, KeyError, TypeError) as e:
logger.debug(f"Could not parse tool message for products: {e}")
continue
return products
def parse_ai_response_fast(ai_raw_content: str) -> tuple[str, list[str], str | None]:
"""
    FAST parse - only extract ai_response + product_ids from the JSON; do NOT query the DB.
    Returns a list of SKUs instead of full product objects.
Returns:
tuple: (ai_text_response, product_skus, user_insight_json)
"""
import json
import re
ai_text_response = ai_raw_content
product_skus = []
user_insight = None
try:
ai_json = json.loads(ai_raw_content)
# Extract basic fields
ai_text_response = ai_json.get("ai_response", ai_raw_content)
explicit_skus = ai_json.get("product_ids", [])
raw_insight = ai_json.get("user_insight")
# Extract SKUs mentioned in text
mentioned_skus_in_text = set(re.findall(r"\[([A-Z0-9]+)\]", ai_text_response))
# Determine target SKUs
if explicit_skus and isinstance(explicit_skus, list):
product_skus = [str(s) for s in explicit_skus]
elif mentioned_skus_in_text:
product_skus = list(mentioned_skus_in_text)
# Convert user_insight to JSON string
if raw_insight:
if isinstance(raw_insight, dict):
user_insight = json.dumps(raw_insight, ensure_ascii=False, indent=2)
elif isinstance(raw_insight, str):
user_insight = raw_insight
logger.info(f"⚡ Fast parse: ai_response={len(ai_text_response)} chars, skus={product_skus}")
except (json.JSONDecodeError, TypeError) as e:
logger.warning(f"⚠️ Fast parse failed: {e}")
return ai_text_response, product_skus, user_insight
async def parse_ai_response_async(ai_raw_content: str, all_products: list) -> tuple[str, list, str | None]:
"""
Async version of parse_ai_response with DB fallback.
    Parse the AI response from the LLM output and map SKUs to product data.
    If a SKU is mentioned but absent from all_products (the current context),
    query the DB directly for its details.
    Flow:
    - The LLM returns: {"ai_response": "...", "product_ids": ["SKU1"], ...}
    - Map SKUs → enriched products from the context
    - If missing → query the DB
"""
import re
from .structured_models import ChatResponse
ai_text_response = ai_raw_content
final_products = []
user_insight = None
logger.info(f"🤖 Raw AI JSON: {ai_raw_content}")
try:
# Try to parse if it's a JSON string from LLM
ai_json = json.loads(ai_raw_content)
# === PYDANTIC VALIDATION ===
try:
# Try strict Pydantic validation
parsed_response = ChatResponse.model_validate(ai_json)
ai_text_response = parsed_response.ai_response
explicit_skus = parsed_response.product_ids
# Convert user_insight to dict/string for storage
if parsed_response.user_insight:
user_insight = parsed_response.user_insight.model_dump_json(indent=2)
logger.info("✅ Pydantic validation passed for ChatResponse")
except Exception as validation_error:
# Fallback to manual parsing if Pydantic fails
logger.warning(f"⚠️ Pydantic validation failed, using fallback: {validation_error}")
ai_text_response = ai_json.get("ai_response", ai_raw_content)
explicit_skus = ai_json.get("product_ids", [])
raw_insight = ai_json.get("user_insight")
if raw_insight:
if isinstance(raw_insight, dict):
user_insight = json.dumps(raw_insight, ensure_ascii=False, indent=2)
elif isinstance(raw_insight, str):
user_insight = raw_insight
# === CRITICAL: Filter/Fetch products ===
# Extract SKUs mentioned in ai_response text using regex pattern [SKU]
mentioned_skus_in_text = set(re.findall(r"\[([A-Z0-9]+)\]", ai_text_response))
logger.info(f"📝 SKUs mentioned in ai_response: {mentioned_skus_in_text}")
# Determine target SKUs
target_skus = set()
# 1. Use explicit SKUs if available and confirmed by text, OR just explicit
if explicit_skus and isinstance(explicit_skus, list):
# Optional: Filter explicit SKUs to only those actually in text to reduce hallucination
# But if explicit list is provided, we generally trust it unless we want strict text-match
if mentioned_skus_in_text:
explicit_set = set(str(s) for s in explicit_skus)
target_skus = explicit_set.intersection(mentioned_skus_in_text)
if not target_skus: # If intersection empty, fallback to text mentions
target_skus = mentioned_skus_in_text
else:
target_skus = set(str(s) for s in explicit_skus)
elif mentioned_skus_in_text:
# 2. If no explicit SKUs, use text mentions
target_skus = mentioned_skus_in_text
logger.info(f"🎯 Target SKUs to return: {target_skus}")
if target_skus:
# Build lookup from current context
product_lookup = {p["sku"]: p for p in all_products if p.get("sku")}
found_products = []
for sku in target_skus:
if sku in product_lookup:
found_products.append(product_lookup[sku])
else:
# SKU not in context (e.g., only stock check was called)
# Don't create dummy product - just skip
logger.debug(f"⚠️ SKU {sku} not in context (stock-only query?), skipping")
final_products = found_products
except (json.JSONDecodeError, TypeError) as e:
logger.warning(f"⚠️ Failed to parse AI response as JSON: {e}")
return ai_text_response, final_products, user_insight
def prepare_execution_context(query: str, user_id: str, history: list, images: list | None):
"""
Prepare initial state and execution config for the graph run.
Returns:
tuple: (initial_state, exec_config)
"""
initial_state: AgentState = {
"user_query": HumanMessage(content=query),
"messages": [HumanMessage(content=query)],
"history": history,
"user_id": user_id,
"images_embedding": [],
"ai_response": None,
}
run_id = str(uuid.uuid4())
# Metadata for LangChain (tags for logging/filtering)
metadata = {
"run_id": run_id,
"tags": "chatbot,production",
}
langfuse_handler = get_callback_handler()
exec_config = RunnableConfig(
configurable={
"user_id": user_id,
"transient_images": images or [],
"run_id": run_id,
},
run_id=run_id,
metadata=metadata,
callbacks=[langfuse_handler] if langfuse_handler else [],
)
return initial_state, exec_config
async def handle_post_chat_async(
memory: ConversationManager, identity_key: str, human_query: str, ai_response: dict | None
):
"""
Save chat history in background task after response is sent.
    Stores the AI response as a JSON string.
"""
if ai_response:
try:
            # Convert the dict to a JSON string for storage in a TEXT field
# Use decimal_default to handle Decimal types from DB
ai_response_json = json.dumps(ai_response, ensure_ascii=False, default=decimal_default)
await memory.save_conversation_turn(identity_key, human_query, ai_response_json)
logger.debug(f"Saved conversation for identity_key {identity_key}")
except Exception as e:
logger.error(f"Failed to save conversation for identity_key {identity_key}: {e}", exc_info=True)
"""
Agent Helper Functions
Các hàm tiện ích cho chat controller.
"""
import json
import logging
import uuid
from decimal import Decimal
from langchain_core.messages import HumanMessage, ToolMessage
from langchain_core.runnables import RunnableConfig
from common.conversation_manager import ConversationManager
from common.langfuse_client import get_callback_handler
from common.starrocks_connection import get_db_connection
from .models import AgentState
import re
logger = logging.getLogger(__name__)
# ==============================================================================
# FORMAT PRODUCT RESULTS
# ==============================================================================
def _parse_description_text(desc: str) -> dict:
"""Parse description_text_full thành dict các field (backward compatibility)."""
result = {}
if not desc:
return result
name_match = re.search(r"product_name:\s*(.+?)\.(?:\s+master_color:|$)", desc)
if name_match:
result["product_name"] = name_match.group(1).strip()
thumb_match = re.search(r"product_image_url_thumbnail:\s*(https?://[^\s]+?)\.(?:\s+product_web_url:|$)", desc)
if thumb_match:
result["product_image_url_thumbnail"] = thumb_match.group(1).strip()
url_match = re.search(r"product_web_url:\s*(https?://[^\s]+?)\.(?:\s+description_text:|$)", desc)
if url_match:
result["product_web_url"] = url_match.group(1).strip()
color_match = re.search(r"master_color:\s*(.+?)\.(?:\s+product_image_url:|$)", desc)
if color_match:
result["master_color"] = color_match.group(1).strip()
return result
def format_product_results(products: list[dict]) -> list[dict]:
"""
Format products - GROUP by base SKU (magento_ref_code), with multiple colors.
Output format:
{
"sku": "1DS25S008",
"name": "Váy liền bé gái",
"colors": [
{"color": "Cam/ Orange", "color_code": "SO123", "url": "...", "thumbnail": "..."},
{"color": "Xanh/ Blue", "color_code": "SB456", "url": "...", "thumbnail": "..."}
],
"price": 349000,
"sale_price": 244000,
"description": "..."
}
"""
max_products = 15
grouped: dict[str, dict] = {} # {magento_ref_code: {product_info}}
for p in products:
# Extract product info
if p.get("product_name"):
name = p["product_name"]
color_name = p.get("master_color") or ""
thumb_url = p.get("product_image_url_thumbnail") or ""
web_url = p.get("product_web_url") or ""
# Fallback: parse from description_text_full if fields are empty
if not color_name or not thumb_url or not web_url:
parsed = _parse_description_text(p.get("description_text_full", ""))
color_name = color_name or parsed.get("master_color", "")
thumb_url = thumb_url or parsed.get("product_image_url_thumbnail", "")
web_url = web_url or parsed.get("product_web_url", "")
else:
desc_full = p.get("description_text_full", "")
parsed = _parse_description_text(desc_full)
name = parsed.get("product_name", "")
color_name = parsed.get("master_color", "")
thumb_url = parsed.get("product_image_url_thumbnail", "")
web_url = parsed.get("product_web_url", "")
original_price = p.get("original_price") or 0
sale_price = p.get("sale_price") or 0
magento_ref = p.get("magento_ref_code", "")
product_color_code = p.get("product_color_code", "")
# Extract color code from product_color_code (VD: 1DS25S008-SO123 → SO123)
color_code_only = ""
if product_color_code and "-" in product_color_code:
parts = product_color_code.split("-", 1)
color_code_only = parts[1] if len(parts) > 1 else ""
# Use magento_ref as base SKU for grouping
base_sku = magento_ref if magento_ref else product_color_code
if not base_sku:
continue
# Color variant info
color_variant = {
"color": color_name,
"color_code": color_code_only,
"url": web_url,
"thumbnail": thumb_url,
}
if base_sku in grouped:
# Add color to existing product
existing_colors = [c["color"] for c in grouped[base_sku]["colors"]]
if color_name and color_name not in existing_colors:
grouped[base_sku]["colors"].append(color_variant)
# Update price range if different
if sale_price and sale_price < grouped[base_sku].get("sale_price", float("inf")):
grouped[base_sku]["sale_price"] = int(sale_price)
else:
# New product - use first color's URL/thumbnail as default
product_entry = {
"sku": base_sku,
"name": name,
"color": color_name, # First color as default
"colors": [color_variant] if color_name else [],
"price": int(original_price),
"sale_price": int(sale_price) if sale_price else int(original_price),
"url": web_url, # First color's URL
"thumbnail_image_url": thumb_url, # First color's thumbnail
"description": (p.get("description_text") or "")[:200],
}
# Include sizes if available (pipe-separated → list)
size_scale = p.get("size_scale")
if size_scale:
product_entry["sizes"] = [s.strip() for s in size_scale.split("|") if s.strip()]
# Include quantity_sold if available (for best seller)
qty_sold = p.get("quantity_sold")
if qty_sold is not None:
product_entry["quantity_sold"] = int(qty_sold)
grouped[base_sku] = product_entry
formatted = list(grouped.values())[:max_products]
logger.info(f"📦 Formatted {len(formatted)} products (grouped by SKU)")
return formatted
def decimal_default(obj):
"""
JSON serializer for objects not serializable by default json code.
Handles Decimal objects.
"""
if isinstance(obj, Decimal):
return float(obj)
raise TypeError(f"Object of type {obj.__class__.__name__} is not JSON serializable")
def extract_product_ids(messages: list) -> list[dict]:
"""
Extract full product info from tool messages (data_retrieval_tool results).
Returns list of product objects with: sku, name, price, sale_price, url, thumbnail_image_url.
"""
products = []
seen_skus = set()
for msg in messages:
if isinstance(msg, ToolMessage):
try:
# Tool result is JSON string
tool_result = json.loads(msg.content)
# Check if tool returned products (new format with "results" wrapper)
if tool_result.get("status") == "success":
# Handle both direct "products" and nested "results" format
product_list = []
if "results" in tool_result:
results_data = tool_result["results"]
if results_data and isinstance(results_data, list):
# Check first item to determine format
first_item = results_data[0] if len(results_data) > 0 else {}
if isinstance(first_item, dict) and "products" in first_item:
# Nested format: {"results": [{"products": [...]}]}
for result_item in results_data:
product_list.extend(result_item.get("products", []))
else:
# Flat format: {"results": [product1, product2]} (Current)
product_list = results_data
elif "products" in tool_result:
# Legacy format: {"products": [...]}
product_list = tool_result["products"]
logger.warning(f"🛠️ [EXTRACT] Extracted {len(product_list)} products from tool")
for product in product_list:
# ⚡ FLATTEN: Tách variants thành separate products (mỗi màu = 1 product)
# Grouped: {"product_id": "...", "name": "...", "variants": [...]}
# → Tách thành N products (1 per variant)
# Check if grouped format (has variants)
if "variants" in product and product.get("variants"):
# Grouped product - expand EACH variant into separate product
product_name = product.get("name", "")
base_sku = product.get("product_id") # ✅ Giữ base SKU giống nhau
for variant in product["variants"]:
variant_sku = variant.get("sku")
# ✅ Use base_sku instead of variant_sku for consistency
display_sku = base_sku if base_sku else variant_sku
# Create unique key for dedup using variant SKU
dedup_key = variant_sku or display_sku
if dedup_key and dedup_key not in seen_skus:
seen_skus.add(dedup_key)
# ✅ Spread ALL variant fields first, then override sku + name
product_obj = {
**variant, # Copy all variant fields (color, price, discount, stock, url, thumbnail, etc.)
"sku": display_sku, # Override with base SKU
"name": product_name, # Override with product name
}
products.append(product_obj)
else:
# Flat format - use directly
sku = product.get("sku") or product.get("internal_ref_code")
if sku and sku not in seen_skus:
seen_skus.add(sku)
product_obj = {
"sku": sku,
"name": product.get("name", ""),
"color": product.get("color", ""),
"price": product.get("price", 0),
"sale_price": product.get("sale_price"),
"url": product.get("url", ""),
"thumbnail_image_url": product.get("thumbnail_image_url", ""),
}
products.append(product_obj)
except (json.JSONDecodeError, KeyError, TypeError) as e:
logger.debug(f"Could not parse tool message for products: {e}")
continue
return products
def parse_ai_response_fast(ai_raw_content: str) -> tuple[str, list[str], str | None]:
"""
FAST parse - Chỉ extract ai_response + product_ids từ JSON, KHÔNG query DB.
Trả về SKU list thay vì full product objects.
Returns:
tuple: (ai_text_response, product_skus, user_insight_json)
"""
import json
import re
ai_text_response = ai_raw_content
product_skus = []
user_insight = None
try:
ai_json = json.loads(ai_raw_content)
# Extract basic fields
ai_text_response = ai_json.get("ai_response", ai_raw_content)
explicit_skus = ai_json.get("product_ids", [])
raw_insight = ai_json.get("user_insight")
# Extract SKUs mentioned in text
mentioned_skus_in_text = set(re.findall(r"\[([A-Z0-9]+)\]", ai_text_response))
# Determine target SKUs
if explicit_skus and isinstance(explicit_skus, list):
product_skus = [str(s) for s in explicit_skus]
elif mentioned_skus_in_text:
product_skus = list(mentioned_skus_in_text)
# Convert user_insight to JSON string
if raw_insight:
if isinstance(raw_insight, dict):
user_insight = json.dumps(raw_insight, ensure_ascii=False, indent=2)
elif isinstance(raw_insight, str):
user_insight = raw_insight
logger.info(f"⚡ Fast parse: ai_response={len(ai_text_response)} chars, skus={product_skus}")
except (json.JSONDecodeError, TypeError) as e:
logger.warning(f"⚠️ Fast parse failed: {e}")
return ai_text_response, product_skus, user_insight
async def parse_ai_response_async(ai_raw_content: str, all_products: list) -> tuple[str, list, str | None]:
"""
Async version of parse_ai_response with DB fallback.
Parse AI response từ LLM output và map SKUs với product data.
Nếu SKU được mention nhưng không có trong all_products (context hiện tại),
sẽ query trực tiếp DB để lấy thông tin.
Flow:
- LLM trả về: {"ai_response": "...", "product_ids": ["SKU1"], ...}
- Map SKUs → enriched products từ context
- Nếu thiếu → Query DB
"""
import re
from .structured_models import ChatResponse
ai_text_response = ai_raw_content
final_products = []
user_insight = None
logger.info(f"🤖 Raw AI JSON: {ai_raw_content}")
try:
# Try to parse if it's a JSON string from LLM
ai_json = json.loads(ai_raw_content)
# === PYDANTIC VALIDATION ===
try:
# Try strict Pydantic validation
parsed_response = ChatResponse.model_validate(ai_json)
ai_text_response = parsed_response.ai_response
explicit_skus = parsed_response.product_ids
# Convert user_insight to dict/string for storage
if parsed_response.user_insight:
user_insight = parsed_response.user_insight.model_dump_json(indent=2)
logger.info("✅ Pydantic validation passed for ChatResponse")
except Exception as validation_error:
# Fallback to manual parsing if Pydantic fails
logger.warning(f"⚠️ Pydantic validation failed, using fallback: {validation_error}")
ai_text_response = ai_json.get("ai_response", ai_raw_content)
explicit_skus = ai_json.get("product_ids", [])
raw_insight = ai_json.get("user_insight")
if raw_insight:
if isinstance(raw_insight, dict):
user_insight = json.dumps(raw_insight, ensure_ascii=False, indent=2)
elif isinstance(raw_insight, str):
user_insight = raw_insight
# === CRITICAL: Filter/Fetch products ===
# Extract SKUs mentioned in ai_response text using regex pattern [SKU]
mentioned_skus_in_text = set(re.findall(r"\[([A-Z0-9]+)\]", ai_text_response))
logger.info(f"📝 SKUs mentioned in ai_response: {mentioned_skus_in_text}")
target_skus = set()
# 1. Use explicit SKUs if available and confirmed by text, OR just explicit
if explicit_skus and isinstance(explicit_skus, list):
if mentioned_skus_in_text:
explicit_set = set(str(s) for s in explicit_skus)
target_skus = explicit_set.intersection(mentioned_skus_in_text)
if not target_skus:
target_skus = mentioned_skus_in_text
else:
target_skus = set(str(s) for s in explicit_skus)
elif mentioned_skus_in_text:
# 2. If no explicit SKUs, use text mentions
target_skus = mentioned_skus_in_text
logger.info(f"🎯 Target SKUs to return: {target_skus}")
if target_skus:
# Build lookup from current context
product_lookup = {p["sku"]: p for p in all_products if p.get("sku")}
found_products = []
for sku in target_skus:
if sku in product_lookup:
found_products.append(product_lookup[sku])
else:
# SKU not in context (e.g., only stock check was called)
# Don't create dummy product - just skip
logger.debug(f"⚠️ SKU {sku} not in context (stock-only query?), skipping")
final_products = found_products
except (json.JSONDecodeError, TypeError) as e:
logger.warning(f"⚠️ Failed to parse AI response as JSON: {e}")
return ai_text_response, final_products, user_insight
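The SKU-resolution precedence above (explicit ∩ text mentions, with fallbacks) is easy to get wrong; a standalone sketch of the same decision order, with illustrative SKU values:

```python
def resolve_target_skus(explicit_skus, mentioned_in_text):
    """Replicates the precedence used above: the intersection of explicit
    and text-mentioned SKUs wins; an empty intersection falls back to the
    text mentions; with no text mentions, explicit SKUs are used as-is."""
    if explicit_skus:
        explicit = {str(s) for s in explicit_skus}
        if mentioned_in_text:
            overlap = explicit & set(mentioned_in_text)
            return overlap or set(mentioned_in_text)
        return explicit
    return set(mentioned_in_text)

print(resolve_target_skus(["A1", "B2"], {"B2", "C3"}))  # intersection wins
```

The fallback to text mentions protects against the LLM listing stale SKUs in `product_ids` while actually recommending different ones in the prose.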
def prepare_execution_context(query: str, user_id: str, history: list, images: list | None):
"""
Prepare initial state and execution config for the graph run.
Returns:
tuple: (initial_state, exec_config)
"""
initial_state: AgentState = {
"user_query": HumanMessage(content=query),
"messages": [HumanMessage(content=query)],
"history": history,
"user_id": user_id,
"images_embedding": [],
"ai_response": None,
}
run_id = str(uuid.uuid4())
# Metadata for LangChain (tags for logging/filtering)
metadata = {
"run_id": run_id,
"tags": "chatbot,production",
}
langfuse_handler = get_callback_handler()
exec_config = RunnableConfig(
configurable={
"user_id": user_id,
"transient_images": images or [],
"run_id": run_id,
},
run_id=run_id,
metadata=metadata,
callbacks=[langfuse_handler] if langfuse_handler else [],
)
return initial_state, exec_config
async def handle_post_chat_async(
memory: ConversationManager, identity_key: str, human_query: str, ai_response: dict | None
):
"""
Save chat history in background task after response is sent.
Lưu AI response dưới dạng JSON string.
"""
if ai_response:
try:
# Serialize the dict to a JSON string for the TEXT column
# decimal_default handles Decimal values coming from the DB
ai_response_json = json.dumps(ai_response, ensure_ascii=False, default=decimal_default)
await memory.save_conversation_turn(identity_key, human_query, ai_response_json)
logger.debug(f"Saved conversation for identity_key {identity_key}")
except Exception as e:
logger.error(f"Failed to save conversation for identity_key {identity_key}: {e}", exc_info=True)
......@@ -26,14 +26,17 @@
**🛒 HƯỚNG DẪN ĐẶT HÀNG (BẮT BUỘC KHI KHÁCH HỎI CÁCH MUA):**
**Khi đã show sản phẩm ra (có product card):**
→ "Bạn bấm vào icon 🛒 **Giỏ hàng** ở góc dưới bên phải sản phẩm, chọn size, chọn màu rồi thêm vào giỏ hàng là đặt hàng được luôn nhé!"
**Khi ĐÃ show sản phẩm (có product card trong conversation):**
→ Nói khách bấm icon 🛒 ở góc dưới bên phải hình sản phẩm, chọn size + màu rồi thêm vào giỏ hàng.
→ Hỏi khách cần xem thêm SP khác không.
**Khi chưa show sản phẩm (hỏi chung "mua sao?"):**
→ "Bạn ghé **canifa.com** để xem sản phẩm nhé! Hoặc nói mình biết bạn đang tìm gì, mình tìm giúp luôn! 😊"
**Khi CHƯA show sản phẩm (conversation mới, chưa tìm SP):**
→ Hướng dẫn 5 bước: vào canifa.com/App → tìm SP → chọn size + màu → thêm giỏ hàng → thanh toán.
→ Hỏi khách cần mình tìm SP gì không.
⚠️ **QUAN TRỌNG:**
- Khi khách hỏi "mua sao?", "đặt hàng sao?", "làm sao để mua?", "mua ở đâu?" → Trả lời ĐÚNG theo 2 case trên
- Phải TỰ VIẾT câu trả lời tự nhiên theo ngữ cảnh, KHÔNG copy nguyên mẫu!
- **CHECK context** trước: đã show SP hay chưa → chọn case A hoặc B
- **KHÔNG** hướng dẫn vào website tìm mã SP khi đã có product card → chỉ cần bấm icon 🛒
- Sau khi giới thiệu SP ưng ý → nhắc khách bấm 🛒 để đặt hàng
......
......@@ -473,6 +473,17 @@ Trước khi trả lời, bạn phải đối chiếu kết quả từ tool vớ
- **LUÔN DÙNG NGOẶC NHỌN KÉP `{{` và `}}` CHO TẤT CẢ JSON OUTPUT**
- **⛔ CẤM TỰ SUY DIỄN gender/age** khi user không nói rõ. "quần váy" → gender: null. "áo lót" → gender: null. CHỈ điền khi user NÓI RÕ!
**⛔⛔⛔ TỐI HẬU THƯ — HƯỚNG DẪN ĐẶT HÀNG ⛔⛔⛔**
- Khi khách hỏi "hướng dẫn đặt hàng" mà CHƯA show sản phẩm nào → Hướng dẫn vào canifa.com/App, tìm SP, chọn size + màu, thêm giỏ hàng, thanh toán
- Khi khách hỏi "hướng dẫn đặt hàng" mà ĐÃ show sản phẩm → Nói bấm icon 🛒 ở góc dưới bên phải hình SP
- ⛔ **CẤM** nhét câu "Nếu mình đã tìm được SP cho bạn rồi..." vào khi CHƯA tìm SP nào!
- ⛔ **CẤM copy nguyên mẫu** template! TỰ VIẾT tự nhiên theo context!
**⛔⛔⛔ TỐI HẬU THƯ — CẤM BỊA MÃ SKU ⛔⛔⛔**
- Chỉ dùng mã SKU ĐÚNG NGUYÊN từ data_retrieval_tool hoặc khách đưa
- ❌ CẤM tự thêm suffix: "6TE25S001" → KHÔNG ĐƯỢC bịa thành "6TE25S001-SZ001"
- Tool tự expand biến thể, bot KHÔNG cần tự ghép color code!
**⚡ QUY TẮC [LAST_ACTION] - QUAN TRỌNG:**
- **TRƯỚC KHI TRẢ LỜI** → Đọc `[LAST_ACTION]` từ insight turn trước để hiểu context
- **TỰ SUY RA** bước tiếp theo dựa trên LAST_ACTION + tin nhắn mới của khách
......@@ -510,6 +521,11 @@ Mình check ngay cho bạn! ⚡"
---
### "Hướng dẫn đặt hàng online"
"Bạn đang muốn đặt sản phẩm gì ạ? 🛒
Bạn cho mình biết để mình tư vấn và hỗ trợ
đặt hàng luôn cho tiện nha! 😄"
⚠️ PHÂN BIỆT 2 CASE — check context trước khi trả lời:
**CASE A: ĐÃ show SP trước đó** → Nói khách bấm icon 🛒 ở góc dưới bên phải hình SP, chọn size + màu, thêm giỏ hàng. Hỏi cần xem SP khác không.
**CASE B: CHƯA show SP** → Hướng dẫn các bước: vào canifa.com/App → tìm SP → chọn size + màu → thêm giỏ hàng → thanh toán. Hỏi cần tìm SP gì không.
⛔ **TỰ VIẾT** câu trả lời tự nhiên, **KHÔNG copy nguyên** mẫu! Mỗi lần trả lời phải khác nhau, tự nhiên như đang nói chuyện.
\ No newline at end of file
......@@ -18,9 +18,9 @@ logger = logging.getLogger(__name__)
LANGFUSE_SYSTEM_PROMPT_NAME = "canifa-stylist-system-prompt"
# Cache 5 phút — balance giữa update nhanh vs performance
# Gọi force_refresh_prompts() nếu cần update ngay lập tức
CACHE_TTL = 300
# Long-lived cache (24h) — effectively permanent for a running process
# Was 300s (5 min); call force_refresh_prompts() to update immediately
CACHE_TTL = 86400  # 24 hours — practically permanent
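A sketch of the read-through pattern this TTL implies — apart from `CACHE_TTL` and `force_refresh_prompts`, the names here are illustrative, not the module's actual API:

```python
import time

CACHE_TTL = 86400  # 24 hours — effectively permanent for a running process

_prompt_cache: dict[str, tuple[float, str]] = {}  # name -> (fetched_at, prompt)

def get_cached_prompt(name: str, fetch) -> str:
    """Return a cached prompt, calling fetch(name) only when the TTL expired."""
    entry = _prompt_cache.get(name)
    if entry and time.monotonic() - entry[0] < CACHE_TTL:
        return entry[1]
    prompt = fetch(name)
    _prompt_cache[name] = (time.monotonic(), prompt)
    return prompt

def force_refresh_prompts() -> None:
    """Drop every cached entry so the next lookup refetches immediately."""
    _prompt_cache.clear()

calls = {"n": 0}

def fake_fetch(name: str) -> str:
    calls["n"] += 1
    return f"prompt body for {name}"

get_cached_prompt("canifa-stylist-system-prompt", fake_fetch)
get_cached_prompt("canifa-stylist-system-prompt", fake_fetch)  # served from cache
print(calls["n"])  # 1
force_refresh_prompts()
get_cached_prompt("canifa-stylist-system-prompt", fake_fetch)
print(calls["n"])  # 2
```

With a 24h TTL the manual refresh hook is the only practical way to pick up a prompt edit without restarting the process, which is exactly the trade-off the comment above describes.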
LANGFUSE_TOOL_PROMPT_MAP = {
"brand_knowledge_tool": "canifa-tool-brand-knowledge",
......
......@@ -19,6 +19,11 @@ class ProductIDStreamingCallback(AsyncCallbackHandler):
Khi có product_ids → trigger break ngay, không đợi user_insight!
"""
# Regex to match Codex reasoning objects like {'id': 'rs_...', 'type': 'reasoning', ...}
_REASONING_RE = re.compile(
r"\{['\"]id['\"]\s*:\s*['\"]rs_[^}]*['\"]type['\"]\s*:\s*['\"]reasoning['\"][^}]*\}",
)
def __init__(self):
self.accumulated_content = ""
self.product_ids_found = False
......@@ -26,16 +31,31 @@ class ProductIDStreamingCallback(AsyncCallbackHandler):
self.product_skus = []
self.product_found_event = asyncio.Event() # ✅ Event thay vì polling!
@staticmethod
def strip_reasoning(text: str) -> str:
"""Remove Codex reasoning objects from text."""
if not text or "reasoning" not in text:
return text
return ProductIDStreamingCallback._REASONING_RE.sub("", text).strip()
async def on_llm_new_token(self, token: str, **kwargs: Any) -> None:
"""
Callback khi LLM sinh token mới.
Accumulate và check regex ngay!
"""
# Responses API may send token as list instead of str
if isinstance(token, list):
token = "".join(str(t) for t in token)
elif not isinstance(token, str):
token = str(token)
self.accumulated_content += token
# Check xem đã có product_ids chưa
if not self.product_ids_found:
product_match = re.search(r'"product_ids"\s*:\s*\[(.*?)\]', self.accumulated_content, re.DOTALL)
# Strip reasoning objects (Codex) + normalize {{/}} before regex matching
clean_content = self.strip_reasoning(self.accumulated_content)
clean_content = clean_content.replace("{{", "{").replace("}}", "}")
product_match = re.search(r'"product_ids"\s*:\s*\[(.*?)\]', clean_content, re.DOTALL)
if product_match:
logger.warning(f"🎯 FOUND product_ids at {len(self.accumulated_content)} chars!")
......@@ -44,7 +64,7 @@ class ProductIDStreamingCallback(AsyncCallbackHandler):
# Extract ai_response với regex robust hơn (handle escaped quotes)
ai_text_match = re.search(
r'"ai_response"\s*:\s*"((?:[^"\\\\]|\\\\.)*)"\s*,\s*"product_ids"',
self.accumulated_content,
clean_content,
re.DOTALL,
)
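The early break works because `re.search` can match the `product_ids` array inside a *partial* JSON document, as soon as its closing bracket has streamed in. A sketch with an illustrative partial payload, including the `{{`/`}}` normalization applied above:

```python
import re

PRODUCT_IDS_RE = re.compile(r'"product_ids"\s*:\s*\[(.*?)\]', re.DOTALL)

# Accumulated tokens form a *partial* JSON document; the array is
# detectable long before the full object (user_insight, ...) completes.
partial = '{{"ai_response": "Anh xem áo [6TS25S018] nhé", "product_ids": ["6TS25S018"], "user_insi'

# Normalize doubled braces (prompt-template escaping) before matching.
clean = partial.replace("{{", "{").replace("}}", "}")
match = PRODUCT_IDS_RE.search(clean)
print(match.group(1))  # '"6TS25S018"'
```

The lazy `(.*?)` stops at the first `]`, so a match fires on the very token that closes the array rather than waiting for the final `}` of the whole response.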
......
......@@ -20,11 +20,15 @@ QUY TẮC CỰC QUAN TRỌNG KHI GỌI TOOL:
- Chỉ tạo tool_call với đúng tham số, KHÔNG trả lời người dùng trong cùng message đó.
- Sau khi tool trả kết quả mới được sinh ai_response.
⛔ CẤM TUYỆT ĐỐI TỰ BỊA MÃ SKU:
- Truyền ĐÚNG NGUYÊN MÃ khách đưa, KHÔNG tự ghép/sáng tạo suffix.
⛔⛔⛔ TỐI HẬU THƯ — CẤM TUYỆT ĐỐI TỰ BỊA MÃ SKU ⛔⛔⛔
- Truyền ĐÚNG NGUYÊN MÃ từ data_retrieval_tool trả về hoặc khách đưa.
- KHÔNG ĐƯỢC tự ghép thêm suffix -SZ001, -SK010, -SW001 hay BẤT KỲ ký tự nào!
- Tool trả về sku="6TE25S001" → skus: "6TE25S001" (ĐÚNG)
❌ skus: "6TE25S001-SZ001" (SAI — BỊA MÃ!)
❌ skus: "6TE25S001-SK010" (SAI — BỊA MÃ!)
- Khách nói "6TS25S018 còn size S không?" → skus: "6TS25S018" (ĐÚNG)
- KHÔNG ĐƯỢC bịa thành "6TS25S018-SZ001" hay bất kỳ mã nào khách KHÔNG đưa.
- Nếu khách chỉ cho base code (VD: 6TS25S018) → truyền base code đó, tool sẽ tự expand.
❌ skus: "6TS25S018-SZ001" (SAI — BỊA!)
- Tool sẽ TỰ EXPAND ra tất cả biến thể từ DB, KHÔNG cần bot tự thêm color code!
----- VÍ DỤ CHI TIẾT -----
......
......@@ -74,7 +74,8 @@ Chỉ CHUẨN HÓA khi user dùng từ đồng nghĩa RÕ RÀNG (bảng mapping
📋 BẢNG MAPPING SYNONYM → TÊN DB (tool tự xử lý, LLM giữ nguyên từ user):
áo thun, áo thun ngắn tay, áo cổ v, áo cổ tym → Áo phông
áo cổ bẻ → Áo Polo
áo bra, áo ngực, áo quây → Áo lót
áo bra, áo bra active, bra → Áo bra active (liên quan: Áo lót)
áo ngực, áo quây → Áo lót (liên quan: Áo bra active)
áo gió, áo khoác mỏng → Áo khoác gió
áo croptop, croptop, baby tee, áo lửng, áo dáng ngắn → Áo Body
áo sát nách, tanktop, tank top, áo dây, áo 2 dây, áo hai dây → Áo ba lỗ
......@@ -205,11 +206,18 @@ CASE 10: "Áo khaki"
→ description: "product_name: Áo khaki. description_text: Áo chất liệu khaki form đẹp"
→ product_line_vn: "Áo"
CASE 11: "Áo lót" hoặc "Áo bra" (NHÓM SP LIÊN QUAN)
→ description: "product_name: Áo lót/Áo bra active. description_text: Áo lót. Áo bra active thoáng mát co giãn tốt"
→ product_name: "Áo lót/Áo bra active"
⚠️ KHÔNG tự suy gender/age! User nói "áo lót" chung → để null. Chỉ điền khi user NÓI RÕ (VD: "áo lót nữ" → women, "áo lót trẻ em" → kid)
⚠️ description_text PHẢI ghi CẢ 2 tên (Áo lót + Áo bra active) để semantic search tìm được cả 2 loại!
CASE 11: "Áo bra" → product_name PHẢI là "Áo bra" (tool tự resolve → Áo bra active + Áo lót)
→ description: "product_name: Áo bra. description_text: Áo bra active thể thao thoáng mát co giãn tốt hỗ trợ tập luyện"
→ product_name: "Áo bra"
→ product_line_vn: "Áo"
⚠️ KHÔNG tự suy gender/age! User nói "áo bra" chung → để null.
CASE 12: "Áo lót" → product_name PHẢI là "Áo lót" (tool tự resolve → Áo lót + Áo bra active)
→ description: "product_name: Áo lót. description_text: Áo lót thoáng mát mềm mại thoải mái"
→ product_name: "Áo lót"
→ product_line_vn: "Áo"
⚠️ KHÔNG tự suy gender/age! Chỉ điền khi user NÓI RÕ (VD: "áo lót nữ" → women, "áo lót trẻ em" → kid)
⚠️ Tool tự tìm CẢ 2 loại (Áo lót + Áo bra active) nhờ RELATED_LINES — LLM chỉ cần giữ đúng tên user nói!
═══════════════════════════════════════════════════════════════
🎉 DỊP LỄ / SỰ KIỆN — description_text ghi lý do + gợi ý phong cách
......
......@@ -63,7 +63,7 @@ class SearchItem(BaseModel):
gender_by_product: str | None = Field(
description="[SQL FILTER] Giới tính. GIÁ TRỊ HỢP LỆ: women, men, boy, girl, unisex, newborn",
)
age_by_product: str | None = Field(
age_by_product: str | None = Field(
description="[SQL FILTER] Độ tuổi. GIÁ TRỊ HỢP LỆ: adult, kid, others",
)
master_color: str | None = Field(
......
import asyncio
import json
import logging
import time
from fastapi import APIRouter, BackgroundTasks, HTTPException
from pydantic import BaseModel
from agent.tools.data_retrieval_tool import SearchItem, data_retrieval_tool
from agent.mock_controller import mock_chat_controller
logger = logging.getLogger(__name__)
router = APIRouter()
# --- HELPERS ---
async def retry_with_backoff(coro_fn, max_retries=3, backoff_factor=2):
"""Retry async function with exponential backoff"""
for attempt in range(max_retries):
try:
return await coro_fn()
except Exception as e:
if attempt == max_retries - 1:
raise
wait_time = backoff_factor**attempt
logger.warning(f"⚠️ Attempt {attempt + 1} failed: {e!s}, retrying in {wait_time}s...")
await asyncio.sleep(wait_time)
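Usage of the helper above, reproduced verbatim so the example runs standalone; note the `backoff_factor**attempt` quirk that makes the first retry always wait 1s regardless of the factor:

```python
import asyncio
import logging

logger = logging.getLogger(__name__)

async def retry_with_backoff(coro_fn, max_retries=3, backoff_factor=2):
    """Copy of the helper above, reproduced so this demo is self-contained."""
    for attempt in range(max_retries):
        try:
            return await coro_fn()
        except Exception as e:
            if attempt == max_retries - 1:
                raise
            # backoff_factor**0 == 1, so the first retry always waits 1s.
            wait_time = backoff_factor**attempt
            logger.warning(f"Attempt {attempt + 1} failed: {e!s}, retrying in {wait_time}s...")
            await asyncio.sleep(wait_time)

async def run_demo():
    calls = {"n": 0}

    async def flaky():
        calls["n"] += 1
        if calls["n"] < 3:
            raise RuntimeError("transient failure")
        return "ok"

    result = await retry_with_backoff(flaky, max_retries=3, backoff_factor=2)
    return result, calls["n"]

result, attempts = asyncio.run(run_demo())
print(result, attempts)  # ok 3
```

The lambda-wrapping seen at the call sites (`lambda: data_retrieval_tool.ainvoke(...)`) exists because `coro_fn` must produce a *fresh* coroutine per attempt; awaiting the same coroutine object twice raises `RuntimeError`.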
# --- MODELS ---
class MockQueryRequest(BaseModel):
user_query: str
user_id: str | None = "test_user"
session_id: str | None = None
images: list[str] | None = None
class MockDBRequest(BaseModel):
query: str | None = None
magento_ref_code: str | None = None
price_min: float | None = None
price_max: float | None = None
top_k: int = 10
class MockRetrieverRequest(BaseModel):
user_query: str
price_min: float | None = None
price_max: float | None = None
magento_ref_code: str | None = None
user_id: str | None = "test_user"
session_id: str | None = None
# --- MOCK LLM RESPONSES (không gọi OpenAI) ---
MOCK_AI_RESPONSES = [
"Dựa trên tìm kiếm của bạn, tôi tìm thấy các sản phẩm phù hợp với nhu cầu của bạn. Những mặt hàng này có chất lượng tốt và giá cả phải chăng.",
"Tôi gợi ý cho bạn những sản phẩm sau. Chúng đều là những lựa chọn phổ biến và nhận được đánh giá cao từ khách hàng.",
"Dựa trên tiêu chí tìm kiếm của bạn, đây là những sản phẩm tốt nhất mà tôi có thể giới thiệu.",
"Những sản phẩm này hoàn toàn phù hợp với yêu cầu của bạn. Hãy xem chi tiết để chọn sản phẩm yêu thích nhất.",
"Tôi đã tìm được các mặt hàng tuyệt vời cho bạn. Hãy kiểm tra chúng để tìm ra lựa chọn tốt nhất.",
]
# --- ENDPOINTS ---
@router.post("/api/mock/agent/chat", summary="Mock Agent Chat (Real Tools + Fake LLM)")
async def mock_chat(req: MockQueryRequest, background_tasks: BackgroundTasks):
"""
Mock Agent Chat using mock_chat_controller:
- ✅ Real embedding + vector search (data_retrieval_tool THẬT)
- ✅ Real products from StarRocks
- ❌ Fake LLM response (no OpenAI cost)
- Perfect for stress testing + end-to-end testing
"""
try:
logger.info(f"🚀 [Mock Agent Chat] Starting with query: {req.user_query}")
result = await mock_chat_controller(
query=req.user_query,
user_id=req.user_id or "test_user",
background_tasks=background_tasks,
images=req.images,
)
return {
"status": "success",
"user_query": req.user_query,
"user_id": req.user_id,
"session_id": req.session_id,
**result, # Include status, ai_response, product_ids, etc.
}
except Exception as e:
logger.error(f"❌ Error in mock agent chat: {e!s}", exc_info=True)
raise HTTPException(status_code=500, detail=f"Mock Agent Chat Error: {e!s}")
@router.post("/api/mock/db/search", summary="Real Data Retrieval Tool (Agent Tool)")
async def mock_db_search(req: MockDBRequest):
"""
Dùng `data_retrieval_tool` THẬT từ Agent:
- Nếu có magento_ref_code → CODE SEARCH (không cần embedding)
- Nếu có query → HYDE SEMANTIC SEARCH (embedding + vector search)
- Lọc theo giá nếu có price_min/price_max
- Trả về sản phẩm thực từ StarRocks
Format input giống SearchItem của agent tool.
"""
try:
logger.info("📍 Data Retrieval Tool called")
start_time = time.time()
# Xây dựng SearchItem từ request - include all required fields
search_item = SearchItem(
query=req.query or "sản phẩm",
magento_ref_code=req.magento_ref_code,
price_min=req.price_min,
price_max=req.price_max,
action="search",
# Metadata fields - all required with None default
gender_by_product=None,
age_by_product=None,
product_name=None,
style=None,
master_color=None,
season=None,
material_group=None,
fitting=None,
form_neckline=None,
form_sleeve=None,
)
logger.info(f"🔧 Search params: {search_item.dict(exclude_none=True)}")
# Gọi data_retrieval_tool THẬT với retry
result_json = await retry_with_backoff(
lambda: data_retrieval_tool.ainvoke({"searches": [search_item]}), max_retries=3
)
result = json.loads(result_json)
elapsed_time = time.time() - start_time
logger.info(f"✅ Data Retrieval completed in {elapsed_time:.3f}s")
return {
"status": result.get("status", "success"),
"search_params": search_item.dict(exclude_none=True),
"total_results": len(result.get("results", [{}])[0].get("products", [])),
"products": result.get("results", [{}])[0].get("products", []),
"processing_time_ms": round(elapsed_time * 1000, 2),
"raw_result": result,
}
except Exception as e:
logger.error(f"❌ Error in DB search: {e!s}", exc_info=True)
raise HTTPException(status_code=500, detail=f"DB Search Error: {e!s}")
@router.post("/api/mock/retrieverdb", summary="Real Embedding + Real DB Vector Search")
@router.post("/api/mock/retriverdb", summary="Real Embedding + Real DB Vector Search (Legacy)")
async def mock_retriever_db(req: MockRetrieverRequest):
"""
API thực tế để test Retriever + DB Search (dùng agent tool):
- Lấy query từ user
- Embedding THẬT (gọi OpenAI embedding trong tool)
- Vector search THẬT trong StarRocks
- Trả về kết quả sản phẩm thực (bỏ qua LLM)
Dùng để test performance của embedding + vector search riêng biệt.
"""
try:
logger.info(f"📍 Retriever DB started: {req.user_query}")
start_time = time.time()
# Xây dựng SearchItem từ request - include all required fields
search_item = SearchItem(
query=req.user_query,
magento_ref_code=req.magento_ref_code,
price_min=req.price_min,
price_max=req.price_max,
action="search",
# Metadata fields - all required with None default
gender_by_product=None,
age_by_product=None,
product_name=None,
style=None,
master_color=None,
season=None,
material_group=None,
fitting=None,
form_neckline=None,
form_sleeve=None,
)
logger.info(f"🔧 Retriever params: {search_item.dict(exclude_none=True)}")
# Gọi data_retrieval_tool THẬT (embedding + vector search) với retry
result_json = await retry_with_backoff(
lambda: data_retrieval_tool.ainvoke({"searches": [search_item]}), max_retries=3
)
result = json.loads(result_json)
elapsed_time = time.time() - start_time
logger.info(f"✅ Retriever completed in {elapsed_time:.3f}s")
# Parse kết quả
search_results = result.get("results", [{}])[0]
products = search_results.get("products", [])
return {
"status": result.get("status", "success"),
"user_query": req.user_query,
"user_id": req.user_id,
"session_id": req.session_id,
"search_params": search_item.dict(exclude_none=True),
"total_results": len(products),
"products": products,
"processing_time_ms": round(elapsed_time * 1000, 2),
}
except Exception as e:
logger.error(f"❌ Error in retriever DB: {e!s}", exc_info=True)
raise HTTPException(status_code=500, detail=f"Retriever DB Error: {e!s}")
import asyncio
import json
import logging
import time
from fastapi import APIRouter, BackgroundTasks, HTTPException
from pydantic import BaseModel
from agent.tools.data_retrieval_tool import SearchItem, data_retrieval_tool
from agent.mock_controller import mock_chat_controller
logger = logging.getLogger(__name__)
router = APIRouter()
# --- HELPERS ---
async def retry_with_backoff(coro_fn, max_retries=3, backoff_factor=2):
"""Retry async function with exponential backoff"""
for attempt in range(max_retries):
try:
return await coro_fn()
except Exception as e:
if attempt == max_retries - 1:
raise
wait_time = backoff_factor**attempt
logger.warning(f"⚠️ Attempt {attempt + 1} failed: {e!s}, retrying in {wait_time}s...")
await asyncio.sleep(wait_time)
# --- MODELS ---
class MockQueryRequest(BaseModel):
user_query: str
user_id: str | None = "test_user"
session_id: str | None = None
images: list[str] | None = None
class MockDBRequest(BaseModel):
query: str | None = None
magento_ref_code: str | None = None
price_min: float | None = None
price_max: float | None = None
top_k: int = 10
class MockRetrieverRequest(BaseModel):
user_query: str
price_min: float | None = None
price_max: float | None = None
magento_ref_code: str | None = None
user_id: str | None = "test_user"
session_id: str | None = None
# --- MOCK LLM RESPONSES (không gọi OpenAI) ---
MOCK_AI_RESPONSES = [
"Dựa trên tìm kiếm của bạn, tôi tìm thấy các sản phẩm phù hợp với nhu cầu của bạn. Những mặt hàng này có chất lượng tốt và giá cả phải chăng.",
"Tôi gợi ý cho bạn những sản phẩm sau. Chúng đều là những lựa chọn phổ biến và nhận được đánh giá cao từ khách hàng.",
"Dựa trên tiêu chí tìm kiếm của bạn, đây là những sản phẩm tốt nhất mà tôi có thể giới thiệu.",
"Những sản phẩm này hoàn toàn phù hợp với yêu cầu của bạn. Hãy xem chi tiết để chọn sản phẩm yêu thích nhất.",
"Tôi đã tìm được các mặt hàng tuyệt vời cho bạn. Hãy kiểm tra chúng để tìm ra lựa chọn tốt nhất.",
]
# --- ENDPOINTS ---
@router.post("/api/mock/agent/chat", summary="Mock Agent Chat (Real Tools + Fake LLM)")
async def mock_chat(req: MockQueryRequest, background_tasks: BackgroundTasks):
"""
Mock Agent Chat using mock_chat_controller:
- ✅ Real embedding + vector search (data_retrieval_tool THẬT)
- ✅ Real products from StarRocks
- ❌ Fake LLM response (no OpenAI cost)
- Perfect for stress testing + end-to-end testing
"""
try:
logger.info(f"🚀 [Mock Agent Chat] Starting with query: {req.user_query}")
result = await mock_chat_controller(
query=req.user_query,
user_id=req.user_id or "test_user",
background_tasks=background_tasks,
images=req.images,
)
return {
"status": "success",
"user_query": req.user_query,
"user_id": req.user_id,
"session_id": req.session_id,
**result, # Include status, ai_response, product_ids, etc.
}
except Exception as e:
logger.error(f"❌ Error in mock agent chat: {e!s}", exc_info=True)
raise HTTPException(status_code=500, detail=f"Mock Agent Chat Error: {e!s}")
@router.post("/api/mock/db/search", summary="Real Data Retrieval Tool (Agent Tool)")
async def mock_db_search(req: MockDBRequest):
"""
Dùng `data_retrieval_tool` THẬT từ Agent:
- Nếu có magento_ref_code → CODE SEARCH (không cần embedding)
- Nếu có query → HYDE SEMANTIC SEARCH (embedding + vector search)
- Lọc theo giá nếu có price_min/price_max
- Trả về sản phẩm thực từ StarRocks
Format input giống SearchItem của agent tool.
"""
try:
logger.info("📍 Data Retrieval Tool called")
start_time = time.time()
# Xây dựng SearchItem từ request - pass all required fields
search_item = SearchItem(
description=f"product_name: {req.query or 'sản phẩm'}. product_line_vn: {req.query or 'sản phẩm'}",
product_name=None,
magento_ref_code=req.magento_ref_code,
gender_by_product=None,
age_by_product=None,
master_color=None,
price_min=req.price_min,
price_max=req.price_max,
discount_min=None,
discount_max=None,
discovery_mode=None,
)
logger.info(f"🔧 Search params: {search_item.dict(exclude_none=True)}")
# Gọi data_retrieval_tool THẬT với retry
result_json = await retry_with_backoff(
lambda: data_retrieval_tool.ainvoke({"searches": [search_item]}), max_retries=3
)
result = json.loads(result_json)
elapsed_time = time.time() - start_time
logger.info(f"✅ Data Retrieval completed in {elapsed_time:.3f}s")
return {
"status": result.get("status", "success"),
"search_params": search_item.dict(exclude_none=True),
"total_results": len(result.get("results", [{}])[0].get("products", [])),
"products": result.get("results", [{}])[0].get("products", []),
"processing_time_ms": round(elapsed_time * 1000, 2),
"raw_result": result,
}
except Exception as e:
logger.error(f"❌ Error in DB search: {e!s}", exc_info=True)
raise HTTPException(status_code=500, detail=f"DB Search Error: {e!s}")
@router.post("/api/mock/retrieverdb", summary="Real Embedding + Real DB Vector Search")
@router.post("/api/mock/retriverdb", summary="Real Embedding + Real DB Vector Search (Legacy)")
async def mock_retriever_db(req: MockRetrieverRequest):
"""
API thực tế để test Retriever + DB Search (dùng agent tool):
- Lấy query từ user
- Embedding THẬT (gọi OpenAI embedding trong tool)
- Vector search THẬT trong StarRocks
- Trả về kết quả sản phẩm thực (bỏ qua LLM)
Dùng để test performance của embedding + vector search riêng biệt.
"""
try:
logger.info(f"📍 Retriever DB started: {req.user_query}")
start_time = time.time()
# Xây dựng SearchItem từ request - pass all required fields
search_item = SearchItem(
description=f"product_name: {req.user_query}. product_line_vn: {req.user_query}",
product_name=None,
magento_ref_code=req.magento_ref_code,
gender_by_product=None,
age_by_product=None,
master_color=None,
price_min=req.price_min,
price_max=req.price_max,
discount_min=None,
discount_max=None,
discovery_mode=None,
)
logger.info(f"🔧 Retriever params: {search_item.dict(exclude_none=True)}")
# Gọi data_retrieval_tool THẬT (embedding + vector search) với retry
result_json = await retry_with_backoff(
lambda: data_retrieval_tool.ainvoke({"searches": [search_item]}), max_retries=3
)
result = json.loads(result_json)
elapsed_time = time.time() - start_time
logger.info(f"✅ Retriever completed in {elapsed_time:.3f}s")
# Parse kết quả
search_results = (result.get("results") or [{}])[0]  # "or" also guards an empty results list
products = search_results.get("products", [])
return {
"status": result.get("status", "success"),
"user_query": req.user_query,
"user_id": req.user_id,
"session_id": req.session_id,
"search_params": search_item.dict(exclude_none=True),
"total_results": len(products),
"products": products,
"processing_time_ms": round(elapsed_time * 1000, 2),
}
except Exception as e:
logger.error(f"❌ Error in retriever DB: {e!s}", exc_info=True)
raise HTTPException(status_code=500, detail=f"Retriever DB Error: {e!s}")
import logging
from fastapi import APIRouter, HTTPException
import re
from datetime import datetime
import httpx
from fastapi import APIRouter, HTTPException, Query
from pydantic import BaseModel, Field
from common.starrocks_connection import get_db_connection
logger = logging.getLogger(__name__)
router = APIRouter()
router = APIRouter(tags=["n8n-agent-tools"])
TABLE_NAME = "shared_source.magento_product_dimension_with_text_embedding"
STORE_TABLE = "shared_source.chatbot_rsa_store_schedule_with_text_embedding"
KNOWLEDGE_TABLE = "shared_source.chatbot_rsa_knowledge"
PROMOTION_TABLE = "shared_source.chatbot_rsa_salerule_with_text_embedding"
CANIFA_STOCK_API = "https://canifa.com/v1/middleware/stock_get_stock_list_parent"
SAFE_TEXT_PATTERN = re.compile(r"[^a-zA-Z0-9À-ỹ\s-]")
def _sanitize_text(value: str | None) -> str:
if not value:
return ""
return SAFE_TEXT_PATTERN.sub("", value).strip()
def _name_contains(product_name: str | None, query: str) -> bool:
return query.lower() in (product_name or "").lower()
def _build_verify_result(candidates: list[dict], sku: str, product_name: str) -> tuple[bool, str, list[dict]]:
sku_matches = [item for item in candidates if sku and (item.get("magento_ref_code") == sku)]
name_matches = [item for item in candidates if product_name and _name_contains(item.get("product_name"), product_name)]
if sku and product_name:
matched_products = [item for item in sku_matches if _name_contains(item.get("product_name"), product_name)]
if matched_products:
return True, "sku_name_match", matched_products
if sku_matches:
return False, "sku_found_name_mismatch", sku_matches
if name_matches:
return False, "name_found_sku_mismatch", name_matches
return False, "no_match", []
if sku:
is_valid = bool(sku_matches)
return is_valid, ("sku_match" if is_valid else "sku_not_found"), sku_matches
is_valid = bool(name_matches)
return is_valid, ("name_match" if is_valid else "name_not_found"), name_matches
@router.get("/api/agent/n8n/products", summary="N8N Specific: Get Sample Products")
async def n8n_get_sample_products(limit: int = 10):
class ProductVerifyRequest(BaseModel):
sku: str | None = Field(default=None, description="Magento reference code, ví dụ: 6ST24S001")
product_name: str | None = Field(default=None, description="Tên sản phẩm cần đối chiếu")
limit: int = Field(default=5, ge=1, le=20, description="Số bản ghi trả về để agent đối chiếu")
@router.get("/api/agent/n8n/products", summary="N8N Specific: Get Sample Products or Search")
async def n8n_get_sample_products(
limit: int = Query(default=10, ge=1, le=50),
q: str | None = Query(default=None, max_length=120),
):
"""
API DÀNH RIÊNG CHO N8N để lấy danh sách sản phẩm thực tế làm data cho AI sinh câu hỏi.
API DÀNH RIÊNG CHO N8N để lấy danh sách sản phẩm hoặc tìm kiếm (phục vụ AI Agent verify).
- Có thể truyền ?q=SKU hoặc từ khóa để tìm sản phẩm cụ thể.
- Code hoàn toàn tách biệt khỏi hệ thống cũ (không đụng vào logic SearchItem/Embedding)
- Trả về danh sách ngẫu nhiên từ StarRocks
"""
try:
from common.starrocks_connection import get_db_connection
db = get_db_connection()
mode = "sample"
normalized_q = None
if q is not None:
normalized_q = _sanitize_text(q)
if not normalized_q:
raise HTTPException(status_code=400, detail="q không hợp lệ sau khi sanitize.")
mode = "search"
# Lấy các sản phẩm có hiển thị trên web, có giá
if normalized_q:
query = """
SELECT
magento_ref_code,
product_name,
description_text,
original_price,
sale_price,
master_color,
gender_by_product,
product_web_url,
product_image_url_thumbnail
FROM shared_source.magento_product_dimension_with_text_embedding
WHERE sale_price > 0
AND quantity_sold > 0
AND (magento_ref_code = %s OR LOWER(product_name) LIKE LOWER(%s))
LIMIT %s
"""
params: tuple[object, ...] = (normalized_q, f"%{normalized_q}%", limit)
else:
query = """
SELECT
magento_ref_code,
product_name,
description_text,
original_price,
sale_price,
master_color,
gender_by_product,
product_web_url,
product_image_url_thumbnail
FROM shared_source.magento_product_dimension_with_text_embedding
WHERE sale_price > 0
AND quantity_sold > 0
ORDER BY rand()
LIMIT %s
"""
params = (limit,)
products = await db.execute_query_async(query, params=params)
return {
"status": "success",
"mode": mode,
"query": q,
"normalized_query": normalized_q,
"total": len(products),
"products": products,
}
except HTTPException:
raise
except Exception as e:
logger.error(f"❌ Error in N8N Dedicated Product Fetch API: {e!s}", exc_info=True)
raise HTTPException(status_code=500, detail=f"N8N API Error: {e!s}") from e
@router.post("/api/agent/n8n/products/verify", summary="N8N Specific: Verify product by SKU/name")
async def n8n_verify_product(req: ProductVerifyRequest):
"""
Endpoint verify cho AI Agent (n8n):
- Nhận SKU và/hoặc tên sản phẩm.
- Trả về cờ is_valid + reason để agent quyết định.
"""
try:
db = get_db_connection()
sku = _sanitize_text(req.sku)
product_name = _sanitize_text(req.product_name)
if not sku and not product_name:
raise HTTPException(status_code=400, detail="Cần truyền ít nhất một trong hai: sku hoặc product_name.")
if req.sku is not None and not sku:
raise HTTPException(status_code=400, detail="sku không hợp lệ sau khi sanitize.")
if req.product_name is not None and not product_name:
raise HTTPException(status_code=400, detail="product_name không hợp lệ sau khi sanitize.")
if sku and product_name:
query = """
SELECT
magento_ref_code,
product_name,
description_text,
original_price,
sale_price,
master_color,
gender_by_product,
product_web_url,
product_image_url_thumbnail
FROM shared_source.magento_product_dimension_with_text_embedding
WHERE sale_price > 0
AND quantity_sold > 0
AND (
magento_ref_code = %s
OR LOWER(product_name) LIKE LOWER(%s)
)
ORDER BY quantity_sold DESC
LIMIT %s
"""
params: tuple[object, ...] = (sku, f"%{product_name}%", req.limit)
elif sku:
query = """
SELECT
magento_ref_code,
product_name,
description_text,
original_price,
sale_price,
master_color,
gender_by_product,
product_web_url,
product_image_url_thumbnail
FROM shared_source.magento_product_dimension_with_text_embedding
WHERE sale_price > 0
AND quantity_sold > 0
AND magento_ref_code = %s
ORDER BY quantity_sold DESC
LIMIT %s
"""
params = (sku, req.limit)
else:
query = """
SELECT
magento_ref_code,
product_name,
description_text,
original_price,
sale_price,
master_color,
gender_by_product,
product_web_url,
product_image_url_thumbnail
FROM shared_source.magento_product_dimension_with_text_embedding
WHERE sale_price > 0
AND quantity_sold > 0
AND LOWER(product_name) LIKE LOWER(%s)
ORDER BY quantity_sold DESC
LIMIT %s
"""
params = (f"%{product_name}%", req.limit)
candidates = await db.execute_query_async(query, params=params)
is_valid, reason, matched_products = _build_verify_result(candidates, sku, product_name)
return {
"status": "success",
"is_valid": is_valid,
"reason": reason,
"input": {
"sku": req.sku,
"product_name": req.product_name,
},
"normalized_input": {
"sku": sku or None,
"product_name": product_name or None,
},
"candidate_count": len(candidates),
"candidates": candidates,
"matched_count": len(matched_products),
"matched_products": matched_products,
}
except HTTPException:
raise
except Exception as e:
logger.error(f"❌ Error in N8N Product Verify API: {e!s}", exc_info=True)
raise HTTPException(status_code=500, detail=f"N8N Verify API Error: {e!s}") from e
# =====================================================================
# TOOL 3: CHECK STOCK (Mirror of check_is_stock tool)
# =====================================================================
@router.get("/api/agent/n8n/stock", summary="N8N: Check product stock via Canifa API")
async def n8n_check_stock(
    sku: str = Query(..., max_length=120, description="Product code (base code or product_color_code). Separate multiple codes with commas."),
):
"""
Proxy tới API tồn kho của canifa.com.
Nhận SKU (VD: 6TS25S018 hoặc 5TS25S023-SY322), trả về thông tin còn hàng theo size.
"""
try:
safe_sku = _sanitize_text(sku)
if not safe_sku:
raise HTTPException(status_code=400, detail="SKU không hợp lệ.")
sku_list = [s.strip() for s in safe_sku.split(",") if s.strip()]
if not sku_list:
raise HTTPException(status_code=400, detail="Không có mã sản phẩm hợp lệ.")
sku_string = ",".join(sku_list)
url = f"{CANIFA_STOCK_API}?skus={sku_string}"
async with httpx.AsyncClient(timeout=8.0) as client:
resp = await client.get(url)
resp.raise_for_status()
stock_data = resp.json()
return {
"status": "success",
"input_skus": sku_list,
"stock_data": stock_data,
}
except HTTPException:
raise
except httpx.TimeoutException:
raise HTTPException(status_code=504, detail="Canifa Stock API timeout.")
except Exception as e:
logger.error(f"❌ Error in N8N Stock API: {e!s}", exc_info=True)
raise HTTPException(status_code=500, detail=f"N8N Stock API Error: {e!s}") from e
# =====================================================================
# TOOL 4: PROMOTIONS (Mirror of canifa_get_promotions tool)
# =====================================================================
@router.get("/api/agent/n8n/promotions", summary="N8N: Get active promotions")
async def n8n_get_promotions(
    date: str | None = Query(default=None, max_length=10, description="Date to check (YYYY-MM-DD). Default: today."),
):
"""
Tra cứu các chương trình khuyến mãi đang diễn ra theo ngày.
"""
try:
target_date = date or datetime.now().strftime("%Y-%m-%d")
db = get_db_connection()
# Lấy random các sản phẩm có hiển thị trên web, có giá
query = f"""
SELECT
magento_ref_code,
product_name,
description_text,
original_price,
sale_price,
master_color,
gender_by_product,
product_web_url,
product_image_url_thumbnail
FROM shared_source.magento_product_dimension_with_text_embedding
WHERE sale_price > 0 AND quantity_sold > 0
ORDER BY rand()
LIMIT {limit}
name,
description,
description_full,
from_date,
to_date,
applied_channel
FROM {PROMOTION_TABLE}
WHERE %s >= DATE(from_date)
AND %s <= DATE(to_date)
ORDER BY
CASE applied_channel
WHEN 'only_online' THEN 0
WHEN 'both' THEN 1
WHEN 'unknown' THEN 2
WHEN 'only_offline' THEN 3
ELSE 4
END,
to_date ASC
LIMIT 20
"""
products = await db.execute_query_async(query)
results = await db.execute_query_async(query, params=(target_date, target_date))
return {
"status": "success",
"total": len(products),
"products": products
"check_date": target_date,
"total": len(results),
"promotions": results,
}
except Exception as e:
logger.error(f"❌ Error in N8N Dedicated Product Fetch API: {e!s}", exc_info=True)
raise HTTPException(status_code=500, detail=f"N8N API Error: {e!s}")
logger.error(f"❌ Error in N8N Promotions API: {e!s}", exc_info=True)
raise HTTPException(status_code=500, detail=f"N8N Promotions API Error: {e!s}") from e
# =====================================================================
# TOOL 5: STORE SEARCH (Mirror of canifa_store_search tool)
# =====================================================================
@router.get("/api/agent/n8n/stores", summary="N8N: Search Canifa stores by location")
async def n8n_search_stores(
    location: str = Query(..., max_length=120, description="District/province/city name."),
):
"""
Tìm kiếm cửa hàng CANIFA theo địa điểm/khu vực.
"""
try:
clean = location.lower().strip()
for prefix in ["quận ", "huyện ", "tỉnh ", "thành phố ", "tp. ", "tp "]:
clean = clean.replace(prefix, "")
clean = clean.strip()
if not clean:
raise HTTPException(status_code=400, detail="Location không hợp lệ.")
db = get_db_connection()
query = f"""
SELECT store_name, address, city, state, phone_number,
schedule_name, time_open_today, time_close_today
FROM {STORE_TABLE}
WHERE LOWER(city) LIKE LOWER(%s)
OR LOWER(state) LIKE LOWER(%s)
OR LOWER(address) LIKE LOWER(%s)
OR LOWER(store_name) LIKE LOWER(%s)
ORDER BY state, city, store_name
LIMIT 20
"""
like_pattern = f"%{clean}%"
results = await db.execute_query_async(query, params=(like_pattern, like_pattern, like_pattern, like_pattern))
return {
"status": "success",
"location": location,
"total": len(results),
"stores": results,
}
except HTTPException:
raise
except Exception as e:
logger.error(f"❌ Error in N8N Store Search API: {e!s}", exc_info=True)
raise HTTPException(status_code=500, detail=f"N8N Store Search API Error: {e!s}") from e
# =====================================================================
# TOOL 6: KNOWLEDGE SEARCH (Mirror of canifa_knowledge_search tool)
# =====================================================================
@router.get("/api/agent/n8n/knowledge", summary="N8N: Search brand knowledge (policies, size chart, etc.)")
async def n8n_search_knowledge(
    q: str = Query(..., max_length=200, description="Question about policies, size chart, loyalty program (KHTT), etc."),
limit: int = Query(default=5, ge=1, le=20),
):
"""
Tra cứu thông tin thương hiệu Canifa (chính sách đổi trả, bảng size, KHTT, v.v.).
Sử dụng semantic search (vector similarity) trên kho kiến thức.
"""
try:
from common.embedding_service import create_embedding_async
query_vector = await create_embedding_async(q)
if not query_vector:
raise HTTPException(status_code=500, detail="Không tạo được embedding cho câu hỏi.")
import json as _json
v_str = _json.dumps(query_vector)
db = get_db_connection()
query = f"""
SELECT content, metadata
FROM {KNOWLEDGE_TABLE}
ORDER BY approx_cosine_similarity(embedding, {v_str}) DESC
LIMIT %s
"""
results = await db.execute_query_async(query, params=(limit,))
return {
"status": "success",
"query": q,
"total": len(results),
"knowledge": results,
}
except HTTPException:
raise
except Exception as e:
logger.error(f"❌ Error in N8N Knowledge Search API: {e!s}", exc_info=True)
raise HTTPException(status_code=500, detail=f"N8N Knowledge API Error: {e!s}") from e
......@@ -9,7 +9,7 @@ import logging
from langchain_core.language_models import BaseChatModel
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from config import OPENAI_API_KEY
from config import GROQ_API_KEY, OPENAI_API_KEY
logger = logging.getLogger(__name__)
......@@ -54,8 +54,8 @@ class LLMFactory:
logger.debug(f"♻️ Using cached model: {clean_model}")
return self._cache[cache_key]
logger.info(f"Creating new LLM instance: {clean_model}")
return self._create_instance(clean_model, streaming, json_mode, api_key)
logger.info(f"Creating new LLM instance: {model_name}")
return self._create_instance(model_name, streaming, json_mode, api_key)
def _create_instance(
self,
......@@ -77,26 +77,52 @@ class LLMFactory:
raise
def _create_openai(self, model_name: str, streaming: bool, json_mode: bool, api_key: str | None) -> BaseChatModel:
"""Create OpenAI model instance."""
key = api_key or OPENAI_API_KEY
if not key:
raise ValueError("OPENAI_API_KEY is required")
"""Create OpenAI-compatible model instance (OpenAI or Groq)."""
# --- Auto-detect provider ---
is_groq = any(kw in model_name.lower() for kw in ("gpt-oss", "llama", "mixtral", "gemma", "qwen", "deepseek"))
# Also detect openai/ prefix used by Groq (e.g. "openai/gpt-oss-120b")
if model_name.startswith("openai/"):
is_groq = True
if is_groq:
# Always use GROQ_API_KEY for Groq models (ignore api_key param which may be OpenAI key)
key = GROQ_API_KEY
base_url = "https://api.groq.com/openai/v1"
if not key:
raise ValueError("GROQ_API_KEY is required for Groq models")
else:
key = api_key or OPENAI_API_KEY
base_url = None # default OpenAI
if not key:
raise ValueError("OPENAI_API_KEY is required")
# Models that require /v1/responses API instead of /v1/chat/completions
needs_responses_api = "codex" in model_name.lower()
llm_kwargs = {
"model": model_name,
"streaming": streaming, # ← STREAMING CONFIG
"streaming": streaming,
"api_key": key,
"temperature": 0,
"max_tokens": 1500,
}
if base_url:
llm_kwargs["base_url"] = base_url
if needs_responses_api:
llm_kwargs["use_responses_api"] = True
logger.info(f"🔄 Using Responses API for model: {model_name}")
if json_mode:
llm_kwargs["model_kwargs"] = {"response_format": {"type": "json_object"}}
logger.info(f"⚙️ Initializing OpenAI in JSON mode: {model_name}")
provider = "Groq" if is_groq else "OpenAI"
logger.warning(f"🔍 DEBUG: provider={provider} | model={model_name} | base_url={base_url} | key={key[:10]}... | is_groq={is_groq}")
llm = ChatOpenAI(**llm_kwargs)
logger.info(f"✅ Created OpenAI: {model_name} | Streaming: {streaming}")
logger.info(f"✅ Created {provider}: {model_name} | Streaming: {streaming}")
return llm
def _enable_json_mode(self, llm: BaseChatModel, model_name: str) -> BaseChatModel:
......
"""
⚡ FastAPI Bottleneck Middleware
================================
Add this to server.py to automatically measure per-request latency.
Usage:
1. Import it in server.py:
from common.profiler_middleware import ProfilerMiddleware
2. Add the middleware:
app.add_middleware(ProfilerMiddleware)
3. View reports:
GET /debug/profiler/stats
GET /debug/profiler/slow (slowest requests)
GET /debug/profiler/reset
"""
import logging
import time
from collections import deque
from dataclasses import dataclass, field
from statistics import mean, median
from starlette.middleware.base import BaseHTTPMiddleware
from starlette.requests import Request
from starlette.responses import JSONResponse
logger = logging.getLogger("profiler.middleware")
@dataclass
class RequestMetric:
path: str
method: str
duration: float
status_code: int
timestamp: float
class ProfilerMiddleware(BaseHTTPMiddleware):
"""Middleware đo latency từng request + expose metrics endpoint."""
# Class-level storage (shared across instances)
_metrics: deque = deque(maxlen=1000) # Last 1000 requests
_slow_threshold: float = 5.0 # Seconds
async def dispatch(self, request: Request, call_next):
# Skip profiler endpoints
if request.url.path.startswith("/debug/profiler"):
return await self._handle_profiler_endpoint(request)
start = time.perf_counter()
response = await call_next(request)
duration = time.perf_counter() - start
metric = RequestMetric(
path=request.url.path,
method=request.method,
duration=duration,
status_code=response.status_code,
timestamp=time.time(),
)
self._metrics.append(metric)
# Log slow requests
if duration > self._slow_threshold:
logger.warning(
f"🐌 SLOW REQUEST: {request.method} {request.url.path} "
f"took {duration:.2f}s (>{self._slow_threshold}s)"
)
# Add timing header
response.headers["X-Response-Time"] = f"{duration:.3f}s"
return response
async def _handle_profiler_endpoint(self, request: Request) -> JSONResponse:
path = request.url.path
if path == "/debug/profiler/stats":
return self._get_stats()
elif path == "/debug/profiler/slow":
return self._get_slow_requests()
elif path == "/debug/profiler/reset":
self._metrics.clear()
return JSONResponse({"status": "reset", "message": "Metrics cleared"})
return JSONResponse({"error": "Unknown profiler endpoint"}, status_code=404)
def _get_stats(self) -> JSONResponse:
if not self._metrics:
return JSONResponse({"message": "No data yet"})
metrics = list(self._metrics)
durations = [m.duration for m in metrics]
# Group by path
path_stats = {}
for m in metrics:
key = f"{m.method} {m.path}"
if key not in path_stats:
path_stats[key] = []
path_stats[key].append(m.duration)
path_summary = {}
for path, times in sorted(path_stats.items(), key=lambda x: -max(x[1])):
path_summary[path] = {
"count": len(times),
"avg": round(mean(times), 3),
"median": round(median(times), 3),
"min": round(min(times), 3),
"max": round(max(times), 3),
}
return JSONResponse({
"total_requests": len(metrics),
"overall": {
"avg": round(mean(durations), 3),
"median": round(median(durations), 3),
"min": round(min(durations), 3),
"max": round(max(durations), 3),
},
"by_path": path_summary,
"slow_count": sum(1 for d in durations if d > self._slow_threshold),
})
def _get_slow_requests(self) -> JSONResponse:
slow = [
{
"path": m.path,
"method": m.method,
"duration": round(m.duration, 3),
"status": m.status_code,
"timestamp": m.timestamp,
}
for m in self._metrics
if m.duration > self._slow_threshold
]
slow.sort(key=lambda x: -x["duration"])
return JSONResponse({"threshold": self._slow_threshold, "slow_requests": slow[:50]})
"""
Config file for Supabase and other environment variables.
Values are read from the .env file via os.getenv.
"""
import os
from dotenv import load_dotenv
# Load environment variables from .env file
load_dotenv()
# Export all config variables for type checking
__all__ = [
"AI_MODEL_NAME",
"AI_SUPABASE_KEY",
"AI_SUPABASE_URL",
"CHECKPOINT_POSTGRES_SCHEMA",
"CHECKPOINT_POSTGRES_URL",
"CLERK_SECRET_KEY",
"CONV_DATABASE_URL",
"CONV_SUPABASE_KEY",
"CONV_SUPABASE_URL",
"DEFAULT_MODEL",
"FIRECRAWL_API_KEY",
"GOOGLE_API_KEY",
"GROQ_API_KEY",
"INTERNAL_STOCK_API",
"JWT_ALGORITHM",
"JWT_SECRET",
"LANGFUSE_BASE_URL",
"LANGFUSE_PUBLIC_KEY",
"LANGFUSE_SECRET_KEY",
"LANGSMITH_API_KEY",
"LANGSMITH_ENDPOINT",
"LANGSMITH_PROJECT",
"LANGSMITH_TRACING",
"MONGODB_DB_NAME",
"MONGODB_URI",
"OPENAI_API_KEY",
"OTEL_EXPORTER_JAEGER_AGENT_HOST",
"OTEL_EXPORTER_JAEGER_AGENT_PORT",
"OTEL_EXPORTER_JAEGER_AGENT_SPLIT_OVERSIZED_BATCHES",
"OTEL_SERVICE_NAME",
"OTEL_TRACES_EXPORTER",
"PORT",
"RATE_LIMIT_GUEST",
"RATE_LIMIT_USER",
"REDIS_HOST",
"REDIS_PASSWORD",
"REDIS_PORT",
"REDIS_USERNAME",
"STARROCKS_DB",
"STARROCKS_HOST",
"STARROCKS_PASSWORD",
"STARROCKS_PORT",
"STARROCKS_USER",
"STOCK_API_URL",
"USE_MONGO_CONVERSATION",
]
# ====================== SUPABASE CONFIGURATION ======================
AI_SUPABASE_URL: str | None = os.getenv("AI_SUPABASE_URL")
AI_SUPABASE_KEY: str | None = os.getenv("AI_SUPABASE_KEY")
CONV_SUPABASE_URL: str | None = os.getenv("CONV_SUPABASE_URL")
CONV_SUPABASE_KEY: str | None = os.getenv("CONV_SUPABASE_KEY")
# ====================== REDIS CONFIGURATION ======================
REDIS_HOST: str | None = os.getenv("REDIS_HOST")
REDIS_PORT: int = int(os.getenv("REDIS_PORT", "6379"))
REDIS_PASSWORD: str | None = os.getenv("REDIS_PASSWORD")
REDIS_USERNAME: str | None = os.getenv("REDIS_USERNAME")
# ====================== AI API KEYS & MODELS ======================
OPENAI_API_KEY: str | None = os.getenv("OPENAI_API_KEY")
GOOGLE_API_KEY: str | None = os.getenv("GOOGLE_API_KEY")
GROQ_API_KEY: str | None = os.getenv("GROQ_API_KEY")
DEFAULT_MODEL: str = os.getenv("DEFAULT_MODEL", "gpt-5-nano")
# DEFAULT_MODEL: str = os.getenv("DEFAULT_MODEL")
# ====================== JWT CONFIGURATION ======================
JWT_SECRET: str | None = os.getenv("JWT_SECRET")
JWT_ALGORITHM: str | None = os.getenv("JWT_ALGORITHM")
# ====================== SERVER CONFIG ======================
PORT: int = int(os.getenv("PORT", "5000"))
FIRECRAWL_API_KEY: str | None = os.getenv("FIRECRAWL_API_KEY")
# ====================== LANGFUSE CONFIGURATION (DEPRECATED) ======================
LANGFUSE_SECRET_KEY: str | None = os.getenv("LANGFUSE_SECRET_KEY")
LANGFUSE_PUBLIC_KEY: str | None = os.getenv("LANGFUSE_PUBLIC_KEY")
LANGFUSE_BASE_URL: str | None = os.getenv("LANGFUSE_BASE_URL", "https://cloud.langfuse.com")
# ====================== LANGSMITH CONFIGURATION (DISABLED DUE TO RATE LIMITS) ======================
# LANGSMITH_TRACING = os.getenv("LANGSMITH_TRACING", "false")
# LANGSMITH_ENDPOINT = os.getenv("LANGSMITH_ENDPOINT", "https://api.smith.langchain.com")
# LANGSMITH_API_KEY = os.getenv("LANGSMITH_API_KEY")
# LANGSMITH_PROJECT = os.getenv("LANGSMITH_PROJECT")
LANGSMITH_TRACING = "false"
LANGSMITH_ENDPOINT = None
LANGSMITH_API_KEY = None
LANGSMITH_PROJECT = None
# ====================== CLERK AUTHENTICATION ======================
CLERK_SECRET_KEY: str | None = os.getenv("CLERK_SECRET_KEY")
# ====================== DATABASE CONNECTION ======================
# Redis Cache Configuration
REDIS_CACHE_URL: str = os.getenv("REDIS_CACHE_URL", "172.16.2.192")
REDIS_CACHE_PORT: int = int(os.getenv("REDIS_CACHE_PORT", "6379"))
REDIS_CACHE_DB: int = int(os.getenv("REDIS_CACHE_DB", "2"))
REDIS_CACHE_TURN_ON: bool = os.getenv("REDIS_CACHE_TURN_ON", "true").lower() == "true"
CONV_DATABASE_URL: str | None = os.getenv("CONV_DATABASE_URL")
# ====================== MONGO CONFIGURATION ======================
MONGODB_URI: str | None = os.getenv("MONGODB_URI", "mongodb://localhost:27017")
MONGODB_DB_NAME: str | None = os.getenv("MONGODB_DB_NAME", "ai_law")
USE_MONGO_CONVERSATION: bool = os.getenv("USE_MONGO_CONVERSATION", "true").lower() == "true"
# ====================== CANIFA INTERNAL POSTGRES ======================
CHECKPOINT_POSTGRES_URL: str | None = os.getenv("CHECKPOINT_POSTGRES_URL")
CHECKPOINT_POSTGRES_SCHEMA: str = os.getenv("CHECKPOINT_POSTGRES_SCHEMA", "canifa_chat")
# ====================== STARROCKS DATA LAKE ======================
STARROCKS_HOST: str | None = os.getenv("STARROCKS_HOST")
STARROCKS_PORT: int = int(os.getenv("STARROCKS_PORT", "9030"))
STARROCKS_USER: str | None = os.getenv("STARROCKS_USER")
STARROCKS_PASSWORD: str | None = os.getenv("STARROCKS_PASSWORD")
STARROCKS_DB: str | None = os.getenv("STARROCKS_DB")
# Placeholder for backward compatibility if needed
AI_MODEL_NAME = DEFAULT_MODEL
# ====================== OPENTELEMETRY CONFIGURATION ======================
OTEL_EXPORTER_JAEGER_AGENT_HOST = os.getenv("OTEL_EXPORTER_JAEGER_AGENT_HOST")
OTEL_EXPORTER_JAEGER_AGENT_PORT = os.getenv("OTEL_EXPORTER_JAEGER_AGENT_PORT")
OTEL_SERVICE_NAME = os.getenv("OTEL_SERVICE_NAME")
OTEL_TRACES_EXPORTER = os.getenv("OTEL_TRACES_EXPORTER")
OTEL_EXPORTER_JAEGER_AGENT_SPLIT_OVERSIZED_BATCHES = os.getenv("OTEL_EXPORTER_JAEGER_AGENT_SPLIT_OVERSIZED_BATCHES")
RATE_LIMIT_GUEST: int = int(os.getenv("RATE_LIMIT_GUEST", "10"))
RATE_LIMIT_USER: int = int(os.getenv("RATE_LIMIT_USER", "100"))
# ====================== STOCK API CONFIGURATION ======================
# External Canifa Stock API (use directly if needed)
STOCK_API_URL: str = os.getenv("STOCK_API_URL", "https://canifa.com/v1/middleware/stock_get_stock_list")
# Internal Stock API (includes logic to expand SKUs from a base code)
INTERNAL_STOCK_API: str = os.getenv("INTERNAL_STOCK_API", "http://localhost:5000/api/stock/check")
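The boolean flags above (`REDIS_CACHE_TURN_ON`, `USE_MONGO_CONVERSATION`) each repeat the `os.getenv(...).lower() == "true"` pattern; a small helper could centralize it (`env_bool` is a hypothetical name, not part of config.py):

```python
import os


def env_bool(name: str, default: bool = False) -> bool:
    """Read an environment variable as a boolean; 'true'/'1'/'yes' count as truthy."""
    raw = os.getenv(name)
    if raw is None:
        return default
    return raw.strip().lower() in ("true", "1", "yes")


# e.g. USE_MONGO_CONVERSATION = env_bool("USE_MONGO_CONVERSATION", default=True)
```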
"""
Config file cho Supabase và các environment variables
Lấy giá trị từ file .env qua os.getenv
"""
import os
from dotenv import load_dotenv
# Load environment variables from .env file
load_dotenv()
# Export all config variables for type checking
__all__ = [
"AI_MODEL_NAME",
"AI_SUPABASE_KEY",
"AI_SUPABASE_URL",
"CHECKPOINT_POSTGRES_SCHEMA",
"CHECKPOINT_POSTGRES_URL",
"CLERK_SECRET_KEY",
"CONV_DATABASE_URL",
"CONV_SUPABASE_KEY",
"CONV_SUPABASE_URL",
"DEFAULT_MODEL",
"FIRECRAWL_API_KEY",
"GOOGLE_API_KEY",
"GROQ_API_KEY",
"INTERNAL_STOCK_API",
"JWT_ALGORITHM",
"JWT_SECRET",
"LANGFUSE_BASE_URL",
"LANGFUSE_PUBLIC_KEY",
"LANGFUSE_SECRET_KEY",
"LANGSMITH_API_KEY",
"LANGSMITH_ENDPOINT",
"LANGSMITH_PROJECT",
"LANGSMITH_TRACING",
"MONGODB_DB_NAME",
"MONGODB_URI",
"OPENAI_API_KEY",
"OTEL_EXPORTER_JAEGER_AGENT_HOST",
"OTEL_EXPORTER_JAEGER_AGENT_PORT",
"OTEL_EXPORTER_JAEGER_AGENT_SPLIT_OVERSIZED_BATCHES",
"OTEL_SERVICE_NAME",
"OTEL_TRACES_EXPORTER",
"PORT",
"RATE_LIMIT_GUEST",
"RATE_LIMIT_USER",
"REDIS_HOST",
"REDIS_PASSWORD",
"REDIS_PORT",
"REDIS_USERNAME",
"STARROCKS_DB",
"STARROCKS_HOST",
"STARROCKS_PASSWORD",
"STARROCKS_PORT",
"STARROCKS_USER",
"STOCK_API_URL",
"USE_MONGO_CONVERSATION",
]
# ====================== SUPABASE CONFIGURATION ======================
AI_SUPABASE_URL: str | None = os.getenv("AI_SUPABASE_URL")
AI_SUPABASE_KEY: str | None = os.getenv("AI_SUPABASE_KEY")
CONV_SUPABASE_URL: str | None = os.getenv("CONV_SUPABASE_URL")
CONV_SUPABASE_KEY: str | None = os.getenv("CONV_SUPABASE_KEY")
# ====================== REDIS CONFIGURATION ======================
REDIS_HOST: str | None = os.getenv("REDIS_HOST")
REDIS_PORT: int = int(os.getenv("REDIS_PORT", "6379"))
REDIS_PASSWORD: str | None = os.getenv("REDIS_PASSWORD")
REDIS_USERNAME: str | None = os.getenv("REDIS_USERNAME")
# ====================== AI API KEYS & MODELS ======================
OPENAI_API_KEY: str | None = os.getenv("OPENAI_API_KEY")
GOOGLE_API_KEY: str | None = os.getenv("GOOGLE_API_KEY")
GROQ_API_KEY: str | None = os.getenv("GROQ_API_KEY")
DEFAULT_MODEL: str = os.getenv("DEFAULT_MODEL")
# DEFAULT_MODEL: str = os.getenv("DEFAULT_MODEL")
# ====================== JWT CONFIGURATION ======================
JWT_SECRET: str | None = os.getenv("JWT_SECRET")
JWT_ALGORITHM: str | None = os.getenv("JWT_ALGORITHM")
# ====================== SERVER CONFIG ======================
PORT: int = int(os.getenv("PORT", "5000"))
FIRECRAWL_API_KEY: str | None = os.getenv("FIRECRAWL_API_KEY")
# ====================== LANGFUSE CONFIGURATION (DEPRECATED) ======================
LANGFUSE_SECRET_KEY: str | None = os.getenv("LANGFUSE_SECRET_KEY")
LANGFUSE_PUBLIC_KEY: str | None = os.getenv("LANGFUSE_PUBLIC_KEY")
LANGFUSE_BASE_URL: str | None = os.getenv("LANGFUSE_BASE_URL", "https://cloud.langfuse.com")
# ====================== LANGSMITH CONFIGURATION (TẮT VÌ RATE LIMIT) ======================
# LANGSMITH_TRACING = os.getenv("LANGSMITH_TRACING", "false")
# LANGSMITH_ENDPOINT = os.getenv("LANGSMITH_ENDPOINT", "https://api.smith.langchain.com")
# LANGSMITH_API_KEY = os.getenv("LANGSMITH_API_KEY")
# LANGSMITH_PROJECT = os.getenv("LANGSMITH_PROJECT")
LANGSMITH_TRACING = "false"
LANGSMITH_ENDPOINT = None
LANGSMITH_API_KEY = None
LANGSMITH_PROJECT = None
# ====================== CLERK AUTHENTICATION ======================
CLERK_SECRET_KEY: str | None = os.getenv("CLERK_SECRET_KEY")
# ====================== DATABASE CONNECTION ======================
# Redis Cache Configuration
REDIS_CACHE_URL: str = os.getenv("REDIS_CACHE_URL", "172.16.2.192")
REDIS_CACHE_PORT: int = int(os.getenv("REDIS_CACHE_PORT", "6379"))
REDIS_CACHE_DB: int = int(os.getenv("REDIS_CACHE_DB", "2"))
REDIS_CACHE_TURN_ON: bool = os.getenv("REDIS_CACHE_TURN_ON", "true").lower() == "true"
CONV_DATABASE_URL: str | None = os.getenv("CONV_DATABASE_URL")
# ====================== MONGO CONFIGURATION ======================
MONGODB_URI: str | None = os.getenv("MONGODB_URI", "mongodb://localhost:27017")
MONGODB_DB_NAME: str | None = os.getenv("MONGODB_DB_NAME", "ai_law")
USE_MONGO_CONVERSATION: bool = os.getenv("USE_MONGO_CONVERSATION", "true").lower() == "true"
# ====================== CANIFA INTERNAL POSTGRES ======================
CHECKPOINT_POSTGRES_URL: str | None = os.getenv("CHECKPOINT_POSTGRES_URL")
CHECKPOINT_POSTGRES_SCHEMA: str = os.getenv("CHECKPOINT_POSTGRES_SCHEMA", "canifa_chat")
# ====================== STARROCKS DATA LAKE ======================
STARROCKS_HOST: str | None = os.getenv("STARROCKS_HOST")
STARROCKS_PORT: int = int(os.getenv("STARROCKS_PORT", "9030"))
STARROCKS_USER: str | None = os.getenv("STARROCKS_USER")
STARROCKS_PASSWORD: str | None = os.getenv("STARROCKS_PASSWORD")
STARROCKS_DB: str | None = os.getenv("STARROCKS_DB")
# Placeholder for backward compatibility if needed
AI_MODEL_NAME = DEFAULT_MODEL
# ====================== OPENTELEMETRY CONFIGURATION ======================
OTEL_EXPORTER_JAEGER_AGENT_HOST = os.getenv("OTEL_EXPORTER_JAEGER_AGENT_HOST")
OTEL_EXPORTER_JAEGER_AGENT_PORT = os.getenv("OTEL_EXPORTER_JAEGER_AGENT_PORT")
OTEL_SERVICE_NAME = os.getenv("OTEL_SERVICE_NAME")
OTEL_TRACES_EXPORTER = os.getenv("OTEL_TRACES_EXPORTER")
OTEL_EXPORTER_JAEGER_AGENT_SPLIT_OVERSIZED_BATCHES = os.getenv("OTEL_EXPORTER_JAEGER_AGENT_SPLIT_OVERSIZED_BATCHES")
RATE_LIMIT_GUEST: int = int(os.getenv("RATE_LIMIT_GUEST", "10"))
RATE_LIMIT_USER: int = int(os.getenv("RATE_LIMIT_USER", "100"))
# ====================== STOCK API CONFIGURATION ======================
# External Canifa Stock API (dùng trực tiếp nếu cần)
STOCK_API_URL: str = os.getenv("STOCK_API_URL", "https://canifa.com/v1/middleware/stock_get_stock_list")
# Internal Stock API (có logic expand SKU từ base code)
INTERNAL_STOCK_API: str = os.getenv("INTERNAL_STOCK_API", "http://localhost:5000/api/stock/check")
services:
# --- n8n Workflow Automation ---
n8n:
image: docker.n8n.io/n8nio/n8n:latest
container_name: canifa_n8n
ports:
- "5678:5678"
environment:
- N8N_HOST=0.0.0.0
- N8N_PORT=5678
- N8N_PROTOCOL=http
- WEBHOOK_URL=http://localhost:5678/
- GENERIC_TIMEZONE=Asia/Ho_Chi_Minh
- TZ=Asia/Ho_Chi_Minh
      # Basic auth - change this password before using in production
- N8N_BASIC_AUTH_ACTIVE=true
- N8N_BASIC_AUTH_USER=admin
- N8N_BASIC_AUTH_PASSWORD=canifa2026
volumes:
- n8n_data:/home/node/.n8n
restart: unless-stopped
networks:
- backend_network
networks:
backend_network:
driver: bridge
ipam:
driver: default
config:
- subnet: "172.24.0.0/16"
gateway: "172.24.0.1"
volumes:
n8n_data:
driver: local
.\.venv\Scripts\activate
uvicorn server:app --host 0.0.0.0 --port 5000 --reload
uvicorn server:app --host 0.0.0.0 --port 5000
docker restart chatbot-backend
docker restart chatbot-backend && docker logs -f chatbot-backend
docker logs -f chatbot-backend
docker restart canifa_backend
sudo docker compose -f docker-compose.prod.yml up -d --build
Get-NetTCPConnection -LocalPort 5000 | ForEach-Object { Stop-Process -Id $_.OwningProcess -Force }
taskkill /F /IM python.exe
netstat -ano | findstr :5000 | ForEach-Object { $_.Split()[-1] } | Sort-Object -Unique | ForEach-Object { taskkill /F /PID $_ }
\ No newline at end of file
# TEST STREAMING + BACKGROUND USER_INSIGHT
Write-Host "`n==== STREAMING TEST ====`n" -ForegroundColor Cyan
$query = "Ao khoac nam mua dong"
$deviceId = "test_stream_verify"
Write-Host "Sending request..." -ForegroundColor Green
$timing = Measure-Command {
$body = '{"user_query":"' + $query + '","device_id":"' + $deviceId + '"}'
$result = $body | curl.exe -s -X POST "http://localhost:5000/api/agent/chat" -H "Content-Type: application/json" --data-binary "@-"
$result | Out-Null
}
Write-Host "`nResponse Time: $($timing.TotalMilliseconds) ms" -ForegroundColor Green
Write-Host "`nCheck backend logs for:" -ForegroundColor Yellow
Write-Host " - Starting LLM streaming" -ForegroundColor Gray
Write-Host " - Regex matched product_ids" -ForegroundColor Gray
Write-Host " - BREAKING STREAM NOW" -ForegroundColor Gray
Write-Host " - Background task extraction" -ForegroundColor Gray
Write-Host "`nDone!" -ForegroundColor Green
"""
Canifa Chatbot Automated Testing Script
========================================
Automatically tests the chatbot by:
1. Creating a Google Sheet (if one doesn't exist)
2. Writing the list of test questions to the sheet
3. Calling the chatbot API for each question
4. Writing the answers back to the sheet
Usage:
python backend/tests/auto_test_chatbot.py
python backend/tests/auto_test_chatbot.py --api-url http://172.16.2.207:5000
"""
from __future__ import annotations
import argparse
import json
import logging
import sys
import time
from datetime import datetime
from pathlib import Path
import gspread
import requests
from google.oauth2.service_account import Credentials
# ==============================================================================
# CONFIGURATION
# ==============================================================================
# Google Sheets
CREDENTIALS_FILE = Path(__file__).parent / "google_sheets_credentials.json"
SHEET_NAME = "Canifa Chatbot Test Results"
SCOPES = [
"https://www.googleapis.com/auth/spreadsheets",
"https://www.googleapis.com/auth/drive",
]
# Service account email - share the sheet with this address
SERVICE_ACCOUNT_EMAIL = "canifa-chatbot-test@rapid-potential-480209-q7.iam.gserviceaccount.com"
# Chatbot API
DEFAULT_API_URL = "http://localhost:5000"
CHAT_ENDPOINT = "/api/agent/chat-dev"
REQUEST_TIMEOUT = 120 # seconds
# Test questions for the chatbot
TEST_QUESTIONS = [
    # Basic product search
"Tìm cho mình chân váy màu đỏ",
"Tìm quần màu đỏ",
"Tìm áo polo nam",
"Tìm áo khoác nữ mùa đông",
    # Fashion advice
"Mình muốn mua đồ đi biển, gợi ý cho mình",
"Cho mình xem áo sơ mi đi làm",
"Gợi ý outfit đi dự tiệc",
    # Special scenarios
"Áo size S giá dưới 500k",
"Có khuyến mãi gì không?",
"Cách đặt hàng online",
"Cửa hàng nào gần nhất ở Hà Nội",
# Edge cases
"Xin chào",
"Cảm ơn bạn",
"Tìm sản phẩm abc123 không tồn tại",
]
# ==============================================================================
# LOGGING
# ==============================================================================
logging.basicConfig(
level=logging.INFO,
format="%(asctime)s | %(levelname)-7s | %(message)s",
datefmt="%H:%M:%S",
)
logger = logging.getLogger(__name__)
# ==============================================================================
# GOOGLE SHEETS FUNCTIONS
# ==============================================================================
def get_gspread_client() -> gspread.Client:
"""Khởi tạo gspread client với service account credentials."""
if not CREDENTIALS_FILE.exists():
logger.error(f"❌ Không tìm thấy file credentials: {CREDENTIALS_FILE}")
logger.info(f" Hãy download JSON key từ Google Cloud Console")
logger.info(f" và đặt tại: {CREDENTIALS_FILE}")
sys.exit(1)
creds = Credentials.from_service_account_file(str(CREDENTIALS_FILE), scopes=SCOPES)
return gspread.authorize(creds)
def create_or_get_sheet(client: gspread.Client) -> gspread.Spreadsheet:
"""Tạo mới hoặc mở Google Sheet đã tồn tại."""
try:
        # Try opening an existing sheet
        spreadsheet = client.open(SHEET_NAME)
        logger.info(f"📄 Found existing sheet: {SHEET_NAME}")
return spreadsheet
except gspread.SpreadsheetNotFound:
        # Create a new sheet
        logger.info(f"📝 Creating new Google Sheet: {SHEET_NAME}")
spreadsheet = client.create(SHEET_NAME)
        # Share the sheet with the owner account
        spreadsheet.share("anhvuhoang2k2@gmail.com", perm_type="user", role="writer")
        logger.info("✅ Shared sheet with anhvuhoang2k2@gmail.com")
return spreadsheet
def setup_worksheet(spreadsheet: gspread.Spreadsheet, test_week: str) -> gspread.Worksheet:
    """Set up the worksheet with headers and test questions."""
    # Create or fetch the worksheet for this test week
    worksheet_title = test_week
    try:
        worksheet = spreadsheet.worksheet(worksheet_title)
        logger.info(f"📋 Found existing worksheet: {worksheet_title}")
    except gspread.WorksheetNotFound:
        worksheet = spreadsheet.add_worksheet(title=worksheet_title, rows=100, cols=6)
        logger.info(f"📋 Created new worksheet: {worksheet_title}")
# Setup headers
headers = [
"STT",
"Câu hỏi test",
"Tuần test & Lần test",
"Câu trả lời của chatbot",
"Thời gian response (ms)",
"Trạng thái",
]
    # Skip if the headers are already present
    existing = worksheet.row_values(1)
    if not existing or existing[0] != "STT":
        worksheet.update("A1:F1", [headers])
        # Format the header row: bold on a blue background
        worksheet.format("A1:F1", {
            "textFormat": {"bold": True},
            "backgroundColor": {"red": 0.2, "green": 0.6, "blue": 0.9},
            "horizontalAlignment": "CENTER",
        })
        logger.info("✅ Headers set up")
return worksheet
def fill_questions(worksheet: gspread.Worksheet, questions: list[str], test_label: str) -> int:
    """Write the question list to the sheet. Returns the starting row."""
    # Find the next empty row
    all_values = worksheet.get_all_values()
    start_row = len(all_values) + 1
    # Prepare the row data
    rows_data = []
    for i, question in enumerate(questions, 1):
        rows_data.append([
            start_row + i - 2,  # STT (running number; row 1 is the header row)
            question,           # Test question
            test_label,         # Test week & round
            "",                 # Chatbot answer (filled in later)
            "",                 # Response time
            "⏳ Pending...",    # Status
        ])
    # Batch update
    cell_range = f"A{start_row}:F{start_row + len(questions) - 1}"
    worksheet.update(cell_range, rows_data)
    logger.info(f"✅ Wrote {len(questions)} questions (starting at row {start_row})")
    return start_row
# ==============================================================================
# CHATBOT API FUNCTIONS
# ==============================================================================
def call_chatbot(api_url: str, question: str) -> dict:
    """
    Call the chatbot API and return the response.
    Returns:
        dict with keys: ai_response, response_time_ms, status, error
    """
url = f"{api_url}{CHAT_ENDPOINT}"
payload = {
"user_query": question,
"images": [],
}
try:
start_time = time.time()
response = requests.post(
url,
json=payload,
headers={"Content-Type": "application/json"},
timeout=REQUEST_TIMEOUT,
)
elapsed_ms = int((time.time() - start_time) * 1000)
if response.status_code == 200:
data = response.json()
ai_response = data.get("ai_response", "")
return {
"ai_response": ai_response,
"response_time_ms": elapsed_ms,
"status": "✅ OK",
"error": None,
}
else:
return {
"ai_response": f"HTTP {response.status_code}: {response.text[:200]}",
"response_time_ms": elapsed_ms,
"status": f"❌ HTTP {response.status_code}",
"error": response.text[:200],
}
    except requests.exceptions.Timeout:
        return {
            "ai_response": "TIMEOUT - exceeded the wait limit",
            "response_time_ms": REQUEST_TIMEOUT * 1000,
            "status": "⏰ Timeout",
            "error": "Request timeout",
        }
    except requests.exceptions.ConnectionError:
        return {
            "ai_response": f"CONNECTION ERROR - cannot connect to {url}",
            "response_time_ms": 0,
            "status": "🔴 Connection Error",
            "error": f"Cannot connect to {url}",
        }
    except Exception as e:
        return {
            "ai_response": f"ERROR: {str(e)[:200]}",
            "response_time_ms": 0,
            "status": "❌ Error",
            "error": str(e)[:200],
        }
# ==============================================================================
# MAIN TEST RUNNER
# ==============================================================================
def run_tests(api_url: str, test_week: str, test_round: int) -> None:
    """Run the full test flow."""
    test_label = f"{test_week} - Lần {test_round}"
    logger.info("=" * 60)
    logger.info(f"🚀 STARTING TEST: {test_label}")
    logger.info(f" API URL: {api_url}")
    logger.info(f" Questions: {len(TEST_QUESTIONS)}")
    logger.info("=" * 60)
    # Step 1: Set up Google Sheets
    logger.info("\n📊 [Step 1] Initializing Google Sheets...")
    client = get_gspread_client()
    spreadsheet = create_or_get_sheet(client)
    worksheet = setup_worksheet(spreadsheet, test_week)
    # Step 2: Fill in the questions
    logger.info("\n📝 [Step 2] Writing test questions...")
    start_row = fill_questions(worksheet, TEST_QUESTIONS, test_label)
    # Step 3: Call the chatbot API & record results
    logger.info(f"\n🤖 [Step 3] Testing {len(TEST_QUESTIONS)} questions...\n")
success_count = 0
total_time = 0
    for i, question in enumerate(TEST_QUESTIONS):
        row_idx = start_row + i
        logger.info(f" [{i+1}/{len(TEST_QUESTIONS)}] Testing: '{question[:50]}...'")
        # Call API
        result = call_chatbot(api_url, question)
        # Truncate for the sheet (5,000 chars, well under Google Sheets' 50,000-char cell limit)
        ai_response = result["ai_response"]
        if len(ai_response) > 5000:
            ai_response = ai_response[:5000] + "... (truncated)"
        # Update the sheet in a single batched call (columns D-F)
        worksheet.update(
            f"D{row_idx}:F{row_idx}",
            [[ai_response, result["response_time_ms"], result["status"]]],
        )
        # Stats
        if "OK" in result["status"]:
            success_count += 1
        total_time += result["response_time_ms"]
        logger.info(f" → {result['status']} ({result['response_time_ms']}ms)")
        # Rate limit: wait 1 second between requests
        if i < len(TEST_QUESTIONS) - 1:
            time.sleep(1)
    # Step 4: Summary
    logger.info("\n" + "=" * 60)
    logger.info("📊 TEST RESULTS:")
    logger.info(f" ✅ Passed: {success_count}/{len(TEST_QUESTIONS)}")
    logger.info(f" ❌ Failed: {len(TEST_QUESTIONS) - success_count}/{len(TEST_QUESTIONS)}")
    logger.info(f" ⏱️ Total time: {total_time}ms")
    logger.info(f" 📈 Average: {total_time // len(TEST_QUESTIONS)}ms per question")
    logger.info(f"\n 📄 Sheet URL: {spreadsheet.url}")
    logger.info("=" * 60)
# ==============================================================================
# CLI ENTRY POINT
# ==============================================================================
def main():
parser = argparse.ArgumentParser(description="Canifa Chatbot Automated Testing")
parser.add_argument(
"--api-url",
default=DEFAULT_API_URL,
help=f"Base URL of chatbot API (default: {DEFAULT_API_URL})",
)
parser.add_argument(
"--week",
default=None,
help="Test week label (default: auto-calculated from current date)",
)
parser.add_argument(
"--round",
type=int,
default=1,
help="Test round number (default: 1)",
)
args = parser.parse_args()
# Auto-calculate test week
if args.week is None:
now = datetime.now()
week_num = now.isocalendar()[1]
args.week = f"Tuần {week_num} ({now.strftime('%m/%Y')})"
run_tests(
api_url=args.api_url,
test_week=args.week,
test_round=args.round,
)
if __name__ == "__main__":
main()
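The `--week` default above is derived from `datetime.isocalendar()`. A minimal sketch of the labeling scheme — the `week_label` helper name is illustrative, not part of the script:

```python
from datetime import datetime

def week_label(now: datetime) -> str:
    # isocalendar() returns (ISO year, ISO week, ISO weekday); ISO weeks
    # start on Monday and week 1 is the week containing the first Thursday.
    week_num = now.isocalendar()[1]
    return f"Tuần {week_num} ({now.strftime('%m/%Y')})"

# January 15, 2026 falls in ISO week 3
print(week_label(datetime(2026, 1, 15)))  # Tuần 3 (01/2026)
```

Note that around New Year the ISO week can belong to the previous or next ISO year, so two labels a few days apart may share a week number across different years.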
"""
🔬 CANIFA Chatbot - Bottleneck Profiler
========================================
End-to-end bottleneck measurement for the chatbot API.
Measures:
1. Total API response time
2. Time-to-first-token (TTFT) for streaming
3. Tool execution latency
4. Memory & CPU usage
5. Async event loop lag
Usage:
    python tests/profiler_bottleneck.py                         # Run with defaults
    python tests/profiler_bottleneck.py --queries 5 --url http://localhost:8001
    python tests/profiler_bottleneck.py --profile cprofile      # Deep profile with cProfile
    python tests/profiler_bottleneck.py --profile pyinstrument  # Flame graph
"""
import argparse
import asyncio
import json
import logging
import os
import platform
import statistics
import sys
import time
from dataclasses import dataclass, field
from pathlib import Path
# Add project root
sys.path.insert(0, str(Path(__file__).parent.parent))
try:
import httpx
except ImportError:
    print("❌ httpx is required: pip install httpx")
sys.exit(1)
try:
import psutil
except ImportError:
psutil = None
    print("⚠️ psutil not installed, skipping CPU/memory monitoring")
# =====================================================================
# CONFIG
# =====================================================================
DEFAULT_URL = "http://localhost:5000"
DEFAULT_QUERIES = [
"Có áo polo nam màu xanh navy không?",
"Tìm váy liền cho bé gái 5 tuổi",
"Gợi ý outfit đi biển mùa hè",
"Áo khoác gió nam size L giá dưới 500k",
"Quần jeans nữ ống rộng màu đen",
]
logging.basicConfig(level=logging.INFO, format="%(message)s")
logger = logging.getLogger("profiler")
# =====================================================================
# DATA MODELS
# =====================================================================
@dataclass
class QueryResult:
query: str
total_time: float = 0.0
ttft: float = 0.0 # Time to first token
token_count: int = 0
status_code: int = 0
error: str = ""
has_products: bool = False
memory_before_mb: float = 0.0
memory_after_mb: float = 0.0
cpu_percent: float = 0.0
chunks_received: int = 0
@dataclass
class ProfileReport:
results: list[QueryResult] = field(default_factory=list)
total_duration: float = 0.0
system_info: dict = field(default_factory=dict)
@property
def avg_response_time(self) -> float:
times = [r.total_time for r in self.results if r.status_code == 200]
return statistics.mean(times) if times else 0
@property
def p50_response_time(self) -> float:
times = sorted([r.total_time for r in self.results if r.status_code == 200])
return times[len(times) // 2] if times else 0
@property
def p95_response_time(self) -> float:
times = sorted([r.total_time for r in self.results if r.status_code == 200])
idx = int(len(times) * 0.95)
return times[min(idx, len(times) - 1)] if times else 0
@property
def avg_ttft(self) -> float:
times = [r.ttft for r in self.results if r.ttft > 0]
return statistics.mean(times) if times else 0
# =====================================================================
# PROFILER CORE
# =====================================================================
class BottleneckProfiler:
    """Profiler for the CANIFA Chatbot API."""
def __init__(self, base_url: str, conversation_id: str = "profiler_test"):
self.base_url = base_url.rstrip("/")
self.conversation_id = conversation_id
self.process = psutil.Process() if psutil else None
def _get_memory_mb(self) -> float:
if self.process:
return self.process.memory_info().rss / (1024 * 1024)
return 0
def _get_cpu_percent(self) -> float:
if self.process:
return self.process.cpu_percent(interval=0.1)
return 0
async def profile_streaming_query(self, query: str, client: httpx.AsyncClient) -> QueryResult:
        """Profile a single query with a streaming response."""
result = QueryResult(query=query)
result.memory_before_mb = self._get_memory_mb()
payload = {
"user_query": query,
"user_id": "profiler_bot",
}
start = time.perf_counter()
first_token_time = None
try:
async with client.stream(
"POST",
f"{self.base_url}/api/agent/chat-dev",
json=payload,
timeout=120.0,
) as response:
result.status_code = response.status_code
if response.status_code != 200:
result.error = f"HTTP {response.status_code}"
result.total_time = time.perf_counter() - start
return result
full_content = ""
async for chunk in response.aiter_text():
if first_token_time is None and chunk.strip():
first_token_time = time.perf_counter()
result.ttft = first_token_time - start
result.chunks_received += 1
full_content += chunk
result.total_time = time.perf_counter() - start
# Check for product IDs in response
result.has_products = "product_ids" in full_content.lower() or "sku" in full_content.lower()
result.token_count = len(full_content.split())
except httpx.TimeoutException:
result.error = "TIMEOUT (>120s)"
result.total_time = time.perf_counter() - start
except Exception as e:
result.error = str(e)
result.total_time = time.perf_counter() - start
result.memory_after_mb = self._get_memory_mb()
result.cpu_percent = self._get_cpu_percent()
return result
async def profile_non_streaming_query(self, query: str, client: httpx.AsyncClient) -> QueryResult:
        """Profile a single query with a non-streaming response (fallback)."""
result = QueryResult(query=query)
result.memory_before_mb = self._get_memory_mb()
payload = {
"user_query": query,
"user_id": "profiler_bot",
}
start = time.perf_counter()
try:
response = await client.post(
f"{self.base_url}/api/agent/chat-dev",
json=payload,
timeout=120.0,
)
result.total_time = time.perf_counter() - start
result.status_code = response.status_code
result.ttft = result.total_time # Non-streaming: TTFT = total
if response.status_code == 200:
body = response.text
result.has_products = "product_ids" in body.lower() or "sku" in body.lower()
result.token_count = len(body.split())
except httpx.TimeoutException:
result.error = "TIMEOUT (>120s)"
result.total_time = time.perf_counter() - start
except Exception as e:
result.error = str(e)
result.total_time = time.perf_counter() - start
result.memory_after_mb = self._get_memory_mb()
result.cpu_percent = self._get_cpu_percent()
return result
async def run(self, queries: list[str], use_streaming: bool = True) -> ProfileReport:
        """Run profiling over a list of queries."""
report = ProfileReport()
report.system_info = self._collect_system_info()
overall_start = time.perf_counter()
async with httpx.AsyncClient() as client:
# Health check
try:
health = await client.get(f"{self.base_url}/health", timeout=5.0)
if health.status_code != 200:
                    logger.error(f"❌ Server not healthy: {health.status_code}")
return report
logger.info(f"✅ Server healthy: {self.base_url}")
except Exception as e:
                logger.error(f"❌ Could not reach the server: {e}")
return report
# Run queries sequentially
for i, query in enumerate(queries, 1):
logger.info(f"\n{'='*60}")
logger.info(f"📝 Query {i}/{len(queries)}: {query}")
logger.info(f"{'='*60}")
if use_streaming:
result = await self.profile_streaming_query(query, client)
else:
result = await self.profile_non_streaming_query(query, client)
report.results.append(result)
                # Log the result immediately
status = "✅" if not result.error else "❌"
logger.info(f"{status} Total: {result.total_time:.2f}s | "
f"TTFT: {result.ttft:.2f}s | "
f"Chunks: {result.chunks_received} | "
f"Tokens: ~{result.token_count}")
if result.error:
logger.info(f" ⚠️ Error: {result.error}")
if psutil:
mem_delta = result.memory_after_mb - result.memory_before_mb
logger.info(f" 💾 Memory: {result.memory_after_mb:.1f}MB "
f"(Δ{mem_delta:+.1f}MB) | CPU: {result.cpu_percent:.1f}%")
                # Pause between queries to avoid overloading the server
if i < len(queries):
await asyncio.sleep(2)
report.total_duration = time.perf_counter() - overall_start
return report
def _collect_system_info(self) -> dict:
info = {
"platform": platform.platform(),
"python": platform.python_version(),
"cpu_count": os.cpu_count(),
}
if psutil:
vm = psutil.virtual_memory()
info["total_ram_gb"] = round(vm.total / (1024**3), 1)
info["available_ram_gb"] = round(vm.available / (1024**3), 1)
return info
# =====================================================================
# EVENT LOOP LAG DETECTOR
# =====================================================================
class EventLoopLagMonitor:
    """Measure async event loop lag to detect blocking code."""
def __init__(self, threshold_ms: float = 100):
self.threshold_ms = threshold_ms
self.lags: list[float] = []
self._running = False
async def start(self):
self._running = True
while self._running:
t1 = time.perf_counter()
await asyncio.sleep(0.01) # 10ms expected
t2 = time.perf_counter()
lag_ms = (t2 - t1 - 0.01) * 1000
if lag_ms > self.threshold_ms:
self.lags.append(lag_ms)
logger.warning(f"⚡ Event loop lag: {lag_ms:.0f}ms (threshold: {self.threshold_ms}ms)")
def stop(self):
self._running = False
@property
def summary(self) -> str:
        if not self.lags:
            return "✅ No event loop lag detected"
        return (
            f"⚠️ {len(self.lags)} lag events > {self.threshold_ms}ms | "
            f"Max: {max(self.lags):.0f}ms | Avg: {statistics.mean(self.lags):.0f}ms"
        )
# =====================================================================
# REPORT GENERATOR
# =====================================================================
def print_report(report: ProfileReport):
    """Print a formatted bottleneck report."""
    print("\n" + "=" * 70)
    print("🔬 BOTTLENECK PROFILING REPORT - CANIFA CHATBOT")
print("=" * 70)
# System info
si = report.system_info
print(f"\n📦 System: {si.get('platform', 'N/A')}")
print(f" Python: {si.get('python', 'N/A')} | CPUs: {si.get('cpu_count', 'N/A')}")
if "total_ram_gb" in si:
print(f" RAM: {si['available_ram_gb']}GB free / {si['total_ram_gb']}GB total")
# Per-query results
print(f"\n{'─'*70}")
print(f"{'Query':<35} {'Total':>8} {'TTFT':>8} {'Chunks':>8} {'Status':>8}")
print(f"{'─'*70}")
for r in report.results:
status = "✅" if not r.error else "❌"
q = r.query[:33] + ".." if len(r.query) > 33 else r.query
print(f"{q:<35} {r.total_time:>7.2f}s {r.ttft:>7.2f}s {r.chunks_received:>7} {status:>8}")
# Summary stats
print(f"\n{'='*70}")
    print("📊 SUMMARY")
print(f"{'='*70}")
successful = [r for r in report.results if r.status_code == 200]
failed = [r for r in report.results if r.error]
    print(f"\n Queries: {len(report.results)} total | {len(successful)} succeeded | {len(failed)} failed")
    print(f" Total duration: {report.total_duration:.1f}s")
if successful:
times = [r.total_time for r in successful]
print(f"\n ⏱️ Response Time:")
print(f" Average: {report.avg_response_time:.2f}s")
print(f" P50: {report.p50_response_time:.2f}s")
print(f" P95: {report.p95_response_time:.2f}s")
print(f" Min: {min(times):.2f}s")
print(f" Max: {max(times):.2f}s")
ttfts = [r.ttft for r in successful if r.ttft > 0]
if ttfts:
print(f"\n 🚀 Time-to-First-Token (TTFT):")
print(f" Average: {report.avg_ttft:.2f}s")
print(f" Min: {min(ttfts):.2f}s")
print(f" Max: {max(ttfts):.2f}s")
if psutil:
mems = [r.memory_after_mb for r in successful]
cpus = [r.cpu_percent for r in successful if r.cpu_percent > 0]
print(f"\n 💾 Resource Usage:")
print(f" Memory: {min(mems):.0f} - {max(mems):.0f} MB")
if cpus:
print(f" CPU: {min(cpus):.0f} - {max(cpus):.0f}%")
# Bottleneck analysis
print(f"\n{'='*70}")
    print("🎯 BOTTLENECK ANALYSIS")
print(f"{'='*70}")
    if report.avg_response_time > 15:
        print("\n 🔴 CRITICAL: Response time > 15s")
        print("   → Check: LLM model latency, tool execution, network")
    elif report.avg_response_time > 10:
        print("\n 🟡 WARNING: Response time 10-15s")
        print("   → Normal for multi-tool + LLM generation")
        print("   → Optimize: cache warming, local embeddings, smaller model")
    elif report.avg_response_time > 5:
        print("\n 🟢 GOOD: Response time 5-10s")
        print("   → Solid for an agentic flow with tool calls")
    else:
        print("\n ⭐ EXCELLENT: Response time < 5s")
    if report.avg_ttft > 8:
        print("\n 🔴 TTFT > 8s: users wait too long")
        print("   → Consider: streaming optimization, early-return pattern")
    elif report.avg_ttft > 4:
        print("\n 🟡 TTFT 4-8s: acceptable, but can be improved")
    else:
        print("\n 🟢 TTFT < 4s: good!")
# Recommendations
print(f"\n{'='*70}")
    print("💡 RECOMMENDED PYTHON PROFILING TOOLS")
print(f"{'='*70}")
    print("""
    📦 E2E bottleneck test is done. To dig deeper, use the following tools:
    1. py-spy (sampling profiler - NO code changes needed)
       pip install py-spy
       py-spy top --pid <PID>                          # Real-time CPU profiling
       py-spy record -o flame.svg -- python server.py  # Flame graph
    2. pyinstrument (call-stack profiler - VERY easy to use)
       pip install pyinstrument
       # Wrap in code:
       from pyinstrument import Profiler
       profiler = Profiler(async_mode="enabled")
       profiler.start()
       # ... code ...
       profiler.stop()
       profiler.print()  # Or profiler.output_html()
    3. yappi (async-aware profiler - BEST for asyncio)
       pip install yappi
       yappi.set_clock_type("wall")  # Wall time, not just CPU time
       yappi.start()
       # ... run queries ...
       yappi.stop()
       yappi.get_func_stats().print_all()
    4. scalene (profiler with AI-assisted analysis - CPU + Memory + GPU)
       pip install scalene
       scalene server.py  # Analyzes bottlenecks automatically
    5. memray (memory profiler from Bloomberg)
       pip install memray
       memray run server.py
       memray flamegraph output.bin  # Memory flame graph
    6. OpenTelemetry (distributed tracing - ALREADY in requirements.txt)
       → opentelemetry-sdk is already installed; just enable traces!
    """)
print("=" * 70)
    print("✅ Profiling complete!")
print("=" * 70)
# =====================================================================
# MAIN
# =====================================================================
async def main():
parser = argparse.ArgumentParser(description="CANIFA Chatbot Bottleneck Profiler")
parser.add_argument("--url", default=DEFAULT_URL, help="Base URL of the API")
parser.add_argument("--queries", type=int, default=5, help="Number of queries to test")
parser.add_argument("--no-stream", action="store_true", help="Disable streaming")
parser.add_argument("--custom-query", type=str, help="Test with a custom query")
parser.add_argument("--profile", choices=["cprofile", "pyinstrument", "yappi"],
help="Enable deep profiling mode")
parser.add_argument("--monitor-lag", action="store_true", help="Monitor event loop lag")
parser.add_argument("--output", type=str, help="Save JSON report to file")
args = parser.parse_args()
# Build query list
if args.custom_query:
queries = [args.custom_query]
else:
queries = DEFAULT_QUERIES[: args.queries]
print("\n" + "=" * 60)
print("🔬 CANIFA CHATBOT - BOTTLENECK PROFILER")
print("=" * 60)
print(f"🌐 Target: {args.url}")
print(f"📝 Queries: {len(queries)}")
print(f"📡 Mode: {'Non-streaming' if args.no_stream else 'Streaming'}")
if args.profile:
print(f"🔍 Profiler: {args.profile}")
print(f"{'='*60}\n")
profiler = BottleneckProfiler(args.url)
# Optional event loop lag monitor
lag_monitor = None
lag_task = None
if args.monitor_lag:
lag_monitor = EventLoopLagMonitor(threshold_ms=100)
lag_task = asyncio.create_task(lag_monitor.start())
# Optional deep profiling
deep_profiler = None
if args.profile == "pyinstrument":
try:
from pyinstrument import Profiler as PyProfiler
deep_profiler = PyProfiler(async_mode="enabled")
deep_profiler.start()
logger.info("🔍 pyinstrument profiler started")
except ImportError:
            logger.warning("⚠️ pyinstrument not installed: pip install pyinstrument")
elif args.profile == "yappi":
try:
import yappi
yappi.set_clock_type("wall")
yappi.start()
logger.info("🔍 yappi profiler started")
except ImportError:
            logger.warning("⚠️ yappi not installed: pip install yappi")
# Run profiling
report = await profiler.run(queries, use_streaming=not args.no_stream)
# Stop monitors
if lag_monitor:
lag_monitor.stop()
if lag_task:
lag_task.cancel()
# Stop deep profilers
if args.profile == "pyinstrument" and deep_profiler:
deep_profiler.stop()
print("\n" + "=" * 60)
print("🔍 PYINSTRUMENT RESULTS")
print("=" * 60)
deep_profiler.print()
# Save HTML report
html = deep_profiler.output_html()
with open("profiler_report.html", "w") as f:
f.write(html)
print("📄 HTML report saved: profiler_report.html")
elif args.profile == "yappi":
try:
import yappi
yappi.stop()
print("\n" + "=" * 60)
print("🔍 YAPPI TOP 30 FUNCTIONS (by total time)")
print("=" * 60)
stats = yappi.get_func_stats()
stats.sort("ttot", "desc")
stats.print_all(
columns={
0: ("name", 60),
1: ("ncall", 8),
2: ("ttot", 10),
3: ("tsub", 10),
4: ("tavg", 10),
},
out=sys.stdout,
)
except ImportError:
pass
# Print report
print_report(report)
# Event loop lag summary
if lag_monitor:
print(f"\n⚡ Event Loop Lag: {lag_monitor.summary}")
# Save JSON output
if args.output:
output_data = {
"system": report.system_info,
"summary": {
"total_queries": len(report.results),
"avg_response_time": report.avg_response_time,
"p50": report.p50_response_time,
"p95": report.p95_response_time,
"avg_ttft": report.avg_ttft,
"total_duration": report.total_duration,
},
"results": [
{
"query": r.query,
"total_time": r.total_time,
"ttft": r.ttft,
"tokens": r.token_count,
"chunks": r.chunks_received,
"status": r.status_code,
"error": r.error,
"has_products": r.has_products,
"memory_mb": r.memory_after_mb,
"cpu_percent": r.cpu_percent,
}
for r in report.results
],
}
with open(args.output, "w", encoding="utf-8") as f:
json.dump(output_data, f, indent=2, ensure_ascii=False)
print(f"\n📄 JSON report saved: {args.output}")
if __name__ == "__main__":
if platform.system() == "Windows":
asyncio.set_event_loop_policy(asyncio.WindowsSelectorEventLoopPolicy())
asyncio.run(main())
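The `EventLoopLagMonitor` above detects blocking code by timing how late `asyncio.sleep` wakes up: any excess over the requested duration is time the loop spent elsewhere. A standalone sketch of the same technique, assuming nothing from the profiler (the `measure_loop_lag` and `demo` names are illustrative):

```python
import asyncio
import time

async def measure_loop_lag(samples: int = 20, expected: float = 0.01) -> list[float]:
    """Sleep `expected` seconds repeatedly and record how late each wake-up is (ms)."""
    lags = []
    for _ in range(samples):
        t1 = time.perf_counter()
        await asyncio.sleep(expected)
        # Any excess over `expected` is scheduling lag caused by other work on the loop.
        lags.append((time.perf_counter() - t1 - expected) * 1000)
    return lags

async def demo() -> float:
    async def blocker():
        await asyncio.sleep(0.05)  # let the monitor start sampling
        time.sleep(0.2)            # synchronous call: stalls the whole event loop

    monitor = asyncio.create_task(measure_loop_lag())
    await blocker()
    lags = await monitor
    return max(lags)

max_lag_ms = asyncio.run(demo())
print(f"max observed lag: {max_lag_ms:.0f}ms")
```

Because the monitor's pending `asyncio.sleep` cannot resume while `time.sleep(0.2)` holds the loop, the largest recorded lag lands near 200ms, which is exactly the signal the profiler's `--monitor-lag` flag surfaces.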
"""
🔬 CANIFA Chatbot - STRESS PROFILER (Clear Cache + Multi-User)
===============================================================
Thorough bottleneck testing:
1. Clear the Redis cache (true cold start)
2. Test 1 user sequentially (baseline)
3. Test multiple users concurrently
4. Compare cold vs. warm performance
Usage:
    python tests/profiler_stress.py               # Full test (clear cache + 1 user + 3 users)
    python tests/profiler_stress.py --users 5     # 5 concurrent users
    python tests/profiler_stress.py --skip-clear  # Do not clear the cache
    python tests/profiler_stress.py --warm-only   # Warm-only test (cache populated)
"""
import argparse
import asyncio
import json
import logging
import os
import platform
import statistics
import sys
import time
from dataclasses import dataclass, field
from pathlib import Path
sys.path.insert(0, str(Path(__file__).parent.parent))
import httpx
try:
import redis.asyncio as aioredis
except ImportError:
aioredis = None
    print("⚠️ redis package not installed, skipping cache clear")
try:
import psutil
except ImportError:
psutil = None
logging.basicConfig(level=logging.INFO, format="%(message)s")
logger = logging.getLogger("stress_profiler")
# =====================================================================
# CONFIG
# =====================================================================
API_URL = "http://localhost:5000"
CHAT_ENDPOINT = "/api/agent/chat-dev"
# Redis config (from config.py defaults)
REDIS_HOST = os.getenv("REDIS_CACHE_URL", "172.16.2.192")
REDIS_PORT = int(os.getenv("REDIS_CACHE_PORT", "6379"))
REDIS_DB = int(os.getenv("REDIS_CACHE_DB", "2"))
REDIS_PASSWORD = os.getenv("REDIS_PASSWORD", None)
REDIS_USERNAME = os.getenv("REDIS_USERNAME", None)
TEST_QUERIES = [
"Có áo polo nam màu xanh navy không?",
"Tìm váy liền cho bé gái 5 tuổi",
"Gợi ý outfit đi biển mùa hè",
"Áo khoác gió nam size L giá dưới 500k",
"Quần jeans nữ ống rộng màu đen",
]
# =====================================================================
# DATA
# =====================================================================
@dataclass
class QueryMetric:
query: str
user_id: str
total_time: float = 0.0
status_code: int = 0
error: str = ""
has_products: bool = False
tokens: int = 0
phase: str = "" # "cold" or "warm"
@dataclass
class PhaseResult:
phase: str
metrics: list[QueryMetric] = field(default_factory=list)
@property
def success_times(self):
return [m.total_time for m in self.metrics if m.status_code == 200]
@property
def avg(self):
t = self.success_times
return statistics.mean(t) if t else 0
@property
def p50(self):
t = sorted(self.success_times)
return t[len(t) // 2] if t else 0
@property
def p95(self):
t = sorted(self.success_times)
return t[int(len(t) * 0.95)] if t else 0
@property
def min_t(self):
t = self.success_times
return min(t) if t else 0
@property
def max_t(self):
t = self.success_times
return max(t) if t else 0
@property
def success_count(self):
return sum(1 for m in self.metrics if m.status_code == 200)
@property
def error_count(self):
return sum(1 for m in self.metrics if m.error)
# =====================================================================
# CACHE MANAGEMENT
# =====================================================================
async def clear_redis_cache() -> dict:
    """Clear the entire Redis cache (resp_cache + emb_cache)."""
if not aioredis:
return {"status": "skipped", "reason": "redis package not installed"}
try:
conn_kwargs = {
"host": REDIS_HOST,
"port": REDIS_PORT,
"db": REDIS_DB,
"decode_responses": True,
"socket_connect_timeout": 5,
}
if REDIS_PASSWORD:
conn_kwargs["password"] = REDIS_PASSWORD
if REDIS_USERNAME:
conn_kwargs["username"] = REDIS_USERNAME
client = aioredis.Redis(**conn_kwargs)
await client.ping()
# Count keys before
resp_keys = await client.keys("resp_cache:*")
emb_keys = await client.keys("emb_cache:*")
counts = {
"resp_cache_before": len(resp_keys),
"emb_cache_before": len(emb_keys),
}
# Delete all cache keys (but keep other keys like prompt_version)
deleted = 0
if resp_keys:
deleted += await client.delete(*resp_keys)
if emb_keys:
deleted += await client.delete(*emb_keys)
counts["deleted"] = deleted
# Verify
resp_after = await client.keys("resp_cache:*")
emb_after = await client.keys("emb_cache:*")
counts["resp_cache_after"] = len(resp_after)
counts["emb_cache_after"] = len(emb_after)
await client.aclose()
return {"status": "cleared", **counts}
except Exception as e:
return {"status": "error", "error": str(e)}
async def get_cache_stats() -> dict:
    """Fetch current cache statistics."""
if not aioredis:
return {"status": "unavailable"}
try:
conn_kwargs = {
"host": REDIS_HOST,
"port": REDIS_PORT,
"db": REDIS_DB,
"decode_responses": True,
"socket_connect_timeout": 5,
}
if REDIS_PASSWORD:
conn_kwargs["password"] = REDIS_PASSWORD
if REDIS_USERNAME:
conn_kwargs["username"] = REDIS_USERNAME
client = aioredis.Redis(**conn_kwargs)
await client.ping()
resp_keys = await client.keys("resp_cache:*")
emb_keys = await client.keys("emb_cache:*")
await client.aclose()
return {
"resp_cache_keys": len(resp_keys),
"emb_cache_keys": len(emb_keys),
}
except Exception as e:
return {"error": str(e)}
# =====================================================================
# QUERY RUNNER
# =====================================================================
async def run_query(client: httpx.AsyncClient, query: str, user_id: str, phase: str) -> QueryMetric:
    """Run one query and measure its duration."""
metric = QueryMetric(query=query, user_id=user_id, phase=phase)
payload = {"user_query": query, "user_id": user_id}
start = time.perf_counter()
try:
response = await client.post(
f"{API_URL}{CHAT_ENDPOINT}",
json=payload,
timeout=180.0,
)
metric.total_time = time.perf_counter() - start
metric.status_code = response.status_code
if response.status_code == 200:
body = response.text
metric.has_products = "product_ids" in body.lower()
metric.tokens = len(body.split())
except httpx.TimeoutException:
metric.total_time = time.perf_counter() - start
metric.error = "TIMEOUT (>180s)"
except Exception as e:
metric.total_time = time.perf_counter() - start
metric.error = str(e)
return metric
# =====================================================================
# TEST PHASES
# =====================================================================
async def run_sequential(queries: list[str], user_id: str, phase: str) -> PhaseResult:
    """Run queries sequentially - single user."""
result = PhaseResult(phase=phase)
async with httpx.AsyncClient() as client:
for i, query in enumerate(queries, 1):
logger.info(f" [{phase.upper()}] {i}/{len(queries)}: {query[:40]}...")
metric = await run_query(client, query, user_id, phase)
result.metrics.append(metric)
status = "✅" if not metric.error else "❌"
logger.info(f" {status} {metric.total_time:.2f}s | Products: {metric.has_products}")
            # Pause 1s between queries
if i < len(queries):
await asyncio.sleep(1)
return result
async def run_concurrent(queries: list[str], num_users: int, phase: str) -> PhaseResult:
    """Run queries concurrently - multiple users."""
result = PhaseResult(phase=phase)
async def user_worker(user_idx: int, query: str):
user_id = f"stress_user_{user_idx}"
async with httpx.AsyncClient() as client:
metric = await run_query(client, query, user_id, phase)
return metric
    # Each user sends one query at the same time
tasks = []
for i in range(num_users):
query = queries[i % len(queries)]
tasks.append(user_worker(i + 1, query))
logger.info(f" [{phase.upper()}] Launching {num_users} concurrent users...")
start = time.perf_counter()
metrics = await asyncio.gather(*tasks, return_exceptions=True)
total = time.perf_counter() - start
logger.info(f" [{phase.upper()}] All {num_users} users finished in {total:.1f}s")
for m in metrics:
if isinstance(m, Exception):
result.metrics.append(QueryMetric(
query="error", user_id="unknown", error=str(m), phase=phase
))
else:
result.metrics.append(m)
return result
# =====================================================================
# REPORT
# =====================================================================
def print_phase_table(phase: PhaseResult):
    """Print the results table for one phase."""
print(f"\n {'User':<15} {'Query':<35} {'Time':>8} {'Status':>8}")
print(f" {'─'*68}")
for m in phase.metrics:
q = m.query[:33] + ".." if len(m.query) > 33 else m.query
status = "✅" if not m.error else f"❌ {m.error[:10]}"
print(f" {m.user_id:<15} {q:<35} {m.total_time:>7.2f}s {status:>8}")
def print_comparison(phases: list[PhaseResult]):
    """Compare results across phases."""
print(f"\n {'Phase':<25} {'Avg':>8} {'P50':>8} {'P95':>8} {'Min':>8} {'Max':>8} {'OK':>5} {'Err':>5}")
print(f" {'─'*78}")
for p in phases:
if p.success_times:
print(
f" {p.phase:<25} {p.avg:>7.2f}s {p.p50:>7.2f}s {p.p95:>7.2f}s "
f"{p.min_t:>7.2f}s {p.max_t:>7.2f}s {p.success_count:>5} {p.error_count:>5}"
)
else:
print(f" {p.phase:<25} {'N/A':>8} {'N/A':>8} {'N/A':>8} {'N/A':>8} {'N/A':>8} {0:>5} {p.error_count:>5}")
def print_full_report(phases: list[PhaseResult], cache_before: dict, cache_after: dict):
"""In báo cáo tổng hợp."""
print("\n" + "=" * 75)
print("🔬 BÁO CÁO STRESS TEST - CANIFA CHATBOT")
print("=" * 75)
# System
print(f"\n📦 System: {platform.platform()}")
print(f" Python: {platform.python_version()} | CPUs: {os.cpu_count()}")
if psutil:
vm = psutil.virtual_memory()
print(f" RAM: {vm.available / (1024**3):.1f}GB free / {vm.total / (1024**3):.1f}GB total")
# Cache info
print(f"\n📦 Redis Cache:")
print(f" Before: resp={cache_before.get('resp_cache_keys', '?')}, emb={cache_before.get('emb_cache_keys', '?')}")
print(f" After: resp={cache_after.get('resp_cache_keys', '?')}, emb={cache_after.get('emb_cache_keys', '?')}")
# Per-phase details
for phase in phases:
print(f"\n{'─'*75}")
print(f"📊 Phase: {phase.phase}")
print_phase_table(phase)
# Comparison
print(f"\n{'='*75}")
print("📊 SO SÁNH CÁC PHASE")
print(f"{'='*75}")
print_comparison(phases)
# Bottleneck analysis
print(f"\n{'='*75}")
print("🎯 PHÂN TÍCH BOTTLENECK")
print(f"{'='*75}")
cold = next((p for p in phases if "cold" in p.phase.lower()), None)
warm = next((p for p in phases if "warm" in p.phase.lower()), None)
concurrent = next((p for p in phases if "concurrent" in p.phase.lower()), None)
if cold and warm and cold.success_times and warm.success_times:
speedup = cold.avg / warm.avg if warm.avg > 0 else 0
cache_impact = cold.avg - warm.avg
print(f"\n 🧊 Cold (no cache) Avg: {cold.avg:.2f}s")
print(f" 🔥 Warm (cached) Avg: {warm.avg:.2f}s")
print(f" ⚡ Cache speedup: {speedup:.1f}x ({cache_impact:.1f}s faster)")
if cache_impact > 3:
print(f" 🔴 LARGE cache impact ({cache_impact:.1f}s): Embedding API is the main bottleneck!")
print(f" → Recommendation: cache warming, local embedding model")
elif cache_impact > 1:
print(f" 🟡 MODERATE cache impact ({cache_impact:.1f}s): noticeable improvement")
else:
print(f" 🟢 SMALL cache impact ({cache_impact:.1f}s): bottleneck is elsewhere (LLM or DB)")
if concurrent and concurrent.success_times:
baseline_avg = warm.avg if (warm and warm.success_times) else (cold.avg if cold and cold.success_times else 0)
if baseline_avg > 0:
slowdown = concurrent.avg / baseline_avg
print(f"\n 👥 Concurrent ({len(concurrent.metrics)} users) Avg: {concurrent.avg:.2f}s")
print(f" 👤 Single user Avg: {baseline_avg:.2f}s")
print(f" 📈 Concurrency penalty: {slowdown:.1f}x")
if slowdown > 2:
print(f" 🔴 CRITICAL: {slowdown:.1f}x slower under multi-user load!")
print(f" → Possible causes: connection pool exhaustion, GIL, LLM API rate limits")
elif slowdown > 1.3:
print(f" 🟡 Moderate penalty: system handles the load but slows down")
else:
print(f" 🟢 Minimal penalty: system scales well!")
# Overall verdict
all_times = []
for p in phases:
all_times.extend(p.success_times)
if all_times:
overall_avg = statistics.mean(all_times)
print(f"\n 📊 Overall Avg: {overall_avg:.2f}s across {len(all_times)} queries")
if overall_avg > 15:
print(" 🔴 VERDICT: SLOW - needs urgent optimization")
elif overall_avg > 10:
print(" 🟡 VERDICT: ACCEPTABLE for a multi-tool + LLM flow")
elif overall_avg > 5:
print(" 🟢 VERDICT: GOOD for an agentic chatbot")
else:
print(" ⭐ VERDICT: EXCELLENT!")
print("\n" + "=" * 75)
# =====================================================================
# MAIN
# =====================================================================
async def main():
parser = argparse.ArgumentParser(description="CANIFA Chatbot Stress Profiler")
parser.add_argument("--users", type=int, default=3, help="Concurrent users (default: 3)")
parser.add_argument("--queries", type=int, default=5, help="Queries per phase (default: 5)")
parser.add_argument("--skip-clear", action="store_true", help="Skip cache clearing")
parser.add_argument("--warm-only", action="store_true", help="Only test warm (skip cold)")
parser.add_argument("--output", type=str, help="Save JSON report")
args = parser.parse_args()
queries = TEST_QUERIES[:args.queries]
print("\n" + "=" * 60)
print("🔬 CANIFA CHATBOT - STRESS PROFILER")
print("=" * 60)
print(f"🌐 Target: {API_URL}")
print(f"📝 Queries/phase: {len(queries)}")
print(f"👥 Concurrent: {args.users} users")
print(f"🧹 Clear cache: {'No' if args.skip_clear else 'Yes'}")
print(f"{'='*60}\n")
# Health check
async with httpx.AsyncClient() as client:
try:
r = await client.get(f"{API_URL}/health", timeout=5.0)
if r.status_code != 200:
logger.error(f"❌ Server unhealthy: {r.status_code}")
return
logger.info(f"✅ Server healthy: {API_URL}")
except Exception as e:
logger.error(f"❌ Không kết nối được: {e}")
return
phases = []
# ── Phase 0: Cache before ──
cache_before = await get_cache_stats()
logger.info(f"📦 Cache trước test: {cache_before}")
# ── Phase 1: COLD TEST (clear cache first) ──
if not args.warm_only:
if not args.skip_clear:
logger.info("\n🧹 CLEARING CACHE...")
clear_result = await clear_redis_cache()
logger.info(f" Result: {clear_result}")
await asyncio.sleep(1)
logger.info("\n🧊 PHASE 1: COLD TEST (1 user, no cache)")
cold_result = await run_sequential(queries, "cold_tester", "🧊 Cold (1 user)")
phases.append(cold_result)
await asyncio.sleep(3) # Let system settle
# ── Phase 2: WARM TEST (cache populated from Phase 1) ──
logger.info("\n🔥 PHASE 2: WARM TEST (1 user, with cache)")
warm_result = await run_sequential(queries, "warm_tester", "🔥 Warm (1 user)")
phases.append(warm_result)
await asyncio.sleep(3)
# ── Phase 3: CONCURRENT TEST ──
logger.info(f"\n👥 PHASE 3: CONCURRENT TEST ({args.users} users)")
concurrent_result = await run_concurrent(queries, args.users, f"👥 Concurrent ({args.users} users)")
phases.append(concurrent_result)
# ── Cache after ──
cache_after = await get_cache_stats()
# ── Report ──
print_full_report(phases, cache_before, cache_after)
# ── Save JSON ──
if args.output:
output_data = {
"cache_before": cache_before,
"cache_after": cache_after,
"phases": [
{
"phase": p.phase,
"avg": p.avg,
"p50": p.p50,
"p95": p.p95,
"min": p.min_t,
"max": p.max_t,
"success": p.success_count,
"errors": p.error_count,
"queries": [
{
"query": m.query,
"user": m.user_id,
"time": round(m.total_time, 3),
"status": m.status_code,
"error": m.error,
"products": m.has_products,
}
for m in p.metrics
],
}
for p in phases
],
}
with open(args.output, "w", encoding="utf-8") as f:
json.dump(output_data, f, indent=2, ensure_ascii=False)
print(f"\n📄 JSON report: {args.output}")
print("\n✅ Stress test hoàn tất!\n")
if __name__ == "__main__":
if platform.system() == "Windows":
asyncio.set_event_loop_policy(asyncio.WindowsSelectorEventLoopPolicy())
asyncio.run(main())
"""
Create Google Sheet using Sheets API v4 directly (not gspread).
This avoids Drive quota issues by creating the spreadsheet via Sheets API.
"""
import json
import sys
from pathlib import Path
from google.oauth2.service_account import Credentials
from googleapiclient.discovery import build
CREDENTIALS_FILE = Path(__file__).parent / "google_credentials.json"
SCOPES = [
"https://www.googleapis.com/auth/spreadsheets",
"https://www.googleapis.com/auth/drive",
]
TEST_QUESTIONS = [
"Tìm cho mình chân váy màu đỏ",
"Tìm quần màu đỏ",
"Tìm áo polo nam",
"Tìm áo khoác nữ mùa đông",
"Mình muốn mua đồ đi biển, gợi ý cho mình",
"Cho mình xem áo sơ mi đi làm",
"Gợi ý outfit đi dự tiệc",
"Áo size S giá dưới 500k",
"Có khuyến mãi gì không?",
"Cách đặt hàng online",
"Cửa hàng nào gần nhất ở Hà Nội",
"Xin chào",
"Cảm ơn bạn",
"Tìm sản phẩm abc123 không tồn tại",
]
def main():
if not CREDENTIALS_FILE.exists():
sys.exit(f"❌ Credentials file not found: {CREDENTIALS_FILE}")
creds = Credentials.from_service_account_file(str(CREDENTIALS_FILE), scopes=SCOPES)
sheets_service = build("sheets", "v4", credentials=creds)
drive_service = build("drive", "v3", credentials=creds)
# Create spreadsheet via Sheets API
spreadsheet_body = {
"properties": {"title": "Canifa Chatbot Test Results"},
"sheets": [{
"properties": {"title": "Test Questions"},
}]
}
print("📝 Creating spreadsheet via Sheets API...")
result = sheets_service.spreadsheets().create(body=spreadsheet_body).execute()
sheet_id = result["spreadsheetId"]
sheet_url = result["spreadsheetUrl"]
print(f"✅ Created! ID: {sheet_id}")
print(f"📊 URL: {sheet_url}")
# Write headers + data
headers = ["STT", "Câu hỏi test", "Câu trả lời", "Thời gian (ms)", "Trạng thái"]
values = [headers]
for i, q in enumerate(TEST_QUESTIONS, 1):
values.append([i, q, "", "", "⏳ Đang chờ..."])
sheets_service.spreadsheets().values().update(
spreadsheetId=sheet_id,
range="Test Questions!A1:E15",
valueInputOption="RAW",
body={"values": values}
).execute()
print(f"✅ Wrote {len(TEST_QUESTIONS)} questions")
# Share with anyone (link)
try:
drive_service.permissions().create(
fileId=sheet_id,
body={"type": "anyone", "role": "writer"},
fields="id"
).execute()
print("✅ Shared as 'anyone with link can edit'")
except Exception as e:
print(f"⚠️ Could not share: {e}")
# Save sheet info
info = {"sheet_url": sheet_url, "sheet_id": sheet_id}
info_path = Path(__file__).parent / "sheet_info.json"
info_path.write_text(json.dumps(info, indent=2))
print(f"💾 Saved to {info_path}")
if __name__ == "__main__":
main()