Comparison with Other Libraries

This chapter compares LMQL with other Python libraries for working with language models.

Comparison with `guidance`

Guidance is a Python library that provides a templating language for large language models, using a Handlebars-like syntax. To highlight the respective strengths of LMQL and Guidance, we compare the two across several dimensions:

	LMQL	Guidance
Language
Syntax	Python Syntax	Handlebars (`{{...}}`)
Control-Flow	Full Python Support	Handlebars block helpers (`{{#if ...}}`, `{{#each ...}}`, …)
Function Calls	Call Any Python Function	`{{func <args>}}` where `func` is passed as a template parameter
Python Integration	LMQL programs act as native Python functions (can capture variables, can be class methods)	Template functions have no access to the surrounding program context
Async API	Async API allowing you to run hundreds of queries in parallel (including cross-query optimization and batching)	-
Decoding
sample/argmax	✅	✅
Advanced decoders (beam search, best_k, var, …)	✅ Decoders	-
Multi-Variable Templates	✅	✅
Varying decoders within a template	In Development	✅
Conditional distributions	✅ `distribution` clause	-
Model Support
OpenAI API	✅	✅
Azure OpenAI	✅	✅
🤗 Transformers	✅	✅
Constraints
Simple Token Length Constraints	✅	✅
RESULT in [a, b, …]	✅	✅
Character-level constraints (number of words, character length)	✅	-
Datatype Constraints (e.g. integer only)	✅	-
Extensible constraint system	✅ Formal Semantics + Extensible + [Paper]	-
Advanced Applications
JSON Decoding	Type Constraints (Preview Release) and Internal Implementation	Snippet
Role Tags	Playground (ChatGPT only for now)	Snippet
Tool Use	Calculator, Search	Search
Generating Tabular Data	LMQL and Pandas	-
Algorithmic Prompting	LLM-based Sorting Algorithms	-
Interactive Chat Interface	Chat in the Playground	-
Code Interpreter	Execute Python Code in LMQL	-
Inline Tool Use	Calculator, Key-Value Storage	-
Runtime Optimization
Tree-based Token Caching	✅ (Blog)	-
Transformers Key-Value Caching	In Development	✅ (`guidance` acceleration)
Cache Persistence across multiple runs	✅ (Blog)	-
Token Healing	In Development	✅
Eager Constraint Evaluation and Short-Circuiting	✅ (Blog)	-
Library Integration
LangChain	LMQL queries can be used seamlessly as LangChain `Chain` objects	-
LlamaIndex	LMQL can directly call and leverage LlamaIndex data structures during decoding	-
Tooling
Interactive Use	✅ Interactive Playground IDE with visual decoding tree and editor	✅ Jupyter Notebook Integration
Visual Studio Code	✅ Extension	-
Output (Streaming)
Streaming Model Output	✅	✅
`websocket` streaming	✅ GitHub	-
REST endpoint	✅ GitHub	-
Server-Sent Event streaming	✅ GitHub	-
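
The `RESULT in [a, b, …]` row in the Constraints section refers to restricting a generated value to a fixed set of strings. The following is a minimal, library-independent sketch of that idea, not the implementation used by LMQL or Guidance. It works on characters rather than tokens, and the names `EOS`, `allowed_next_chars`, `constrained_greedy_decode`, and `toy_score` are hypothetical, introduced only for illustration. `toy_score` is an assumed stand-in for a real model's next-token scores.

```python
from typing import Callable, Sequence, Set

EOS = ""  # hypothetical end-of-sequence marker (assumption for this sketch)


def allowed_next_chars(prefix: str, options: Sequence[str]) -> Set[str]:
    """Return the characters (or EOS) that keep `prefix` extendable to an option."""
    allowed = set()
    for option in options:
        if option.startswith(prefix):
            if len(option) == len(prefix):
                allowed.add(EOS)  # prefix is already a complete option
            else:
                allowed.add(option[len(prefix)])  # next character of this option
    return allowed


def constrained_greedy_decode(
    options: Sequence[str],
    score: Callable[[str, str], float],
) -> str:
    """Greedily pick the best-scoring allowed character until EOS is chosen."""
    prefix = ""
    while True:
        allowed = allowed_next_chars(prefix, options)
        if not allowed:
            raise ValueError("no allowed continuation; is `options` empty?")
        # Sorting makes tie-breaking deterministic.
        best = max(sorted(allowed), key=lambda ch: score(prefix, ch))
        if best == EOS:
            return prefix
        prefix += best


def toy_score(prefix: str, ch: str) -> float:
    """Toy scorer that favours spelling out the word 'negative'."""
    target = "negative"
    if ch == EOS:
        return 1.0 if prefix == target else 0.0
    return 1.0 if target.startswith(prefix + ch) else 0.0


print(constrained_greedy_decode(["positive", "neutral", "negative"], toy_score))  # negative
```

Because disallowed characters are never considered, the result is always one of the listed options, whatever the scorer prefers. A real system would apply the same kind of mask to the model's token distribution at each decoding step.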