Glossary
- LLM: Large Language Model; a neural network, typically transformer-based, trained on large text corpora to predict and generate text.
- RAG: Retrieval-Augmented Generation; retrieved documents are added to the prompt so the model's output is grounded in external sources.
- Token: The unit of text a tokenizer produces (often a subword); API cost and context-window limits are measured in tokens.
- Quantization: Reducing numerical precision (e.g., FP16→INT4) to shrink memory footprint and speed up inference.
- Beam Search: A decoding strategy that keeps the k highest-scoring partial sequences at each step rather than a single greedy choice.
- KV Cache: Cached key/value attention tensors from earlier steps, reused to avoid recomputation during autoregressive decoding.
- Guardrails: Controls (e.g., filters, validators, constrained prompts) that steer or restrict model outputs.
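To make the Quantization entry concrete, here is a minimal sketch of symmetric per-tensor quantization to the INT4 range [-8, 7]. The function names and the plain-list representation are illustrative, not from any particular library; real implementations operate on tensors and often use per-channel or grouped scales.

```python
def quantize_int4(weights):
    """Map floats to 4-bit integers [-8, 7] using one shared scale."""
    # Scale so the largest magnitude maps near the max positive value (7).
    scale = max(abs(w) for w in weights) / 7 or 1.0
    q = [max(-8, min(7, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float values from the quantized integers."""
    return [v * scale for v in q]

weights = [0.12, -0.5, 0.33, 0.07]
q, scale = quantize_int4(weights)
approx = dequantize(q, scale)
# Each recovered value differs from the original by at most scale/2,
# which is the precision lost in exchange for 4-bit storage.
```

The trade-off the glossary entry alludes to is visible here: storage drops from 16 (or 32) bits per weight to 4, at the cost of a bounded rounding error proportional to the scale.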