Comparison with Other Libraries

This chapter compares LMQL with other Python libraries for language model use.

Comparison with guidance

Guidance is a templating language for large language models. It is a Python library with a Handlebars-like syntax. To highlight the respective strengths of LMQL and Guidance, we compare them across several dimensions:

| Category | Feature | LMQL | Guidance |
|---|---|---|---|
| Language | Syntax | Python Syntax | Handlebars ({{...}}) |
| Language | Control-Flow | Full Python Support | Provides ({{#if ...}}, {{#each ...}}, …) |
| Language | Function Calls | Call Any Python Function | {{func <args>}} where func is passed as template parameter |
| Language | Python Integration | LMQL programs act as native Python functions (capture variables, can be class method) | Template functions have no access to surrounding program context |
| Language | Async API | Async API allowing you to run hundreds of queries in parallel (including cross-query optimization and batching) | - |
| Decoding | sample/argmax | | |
| Decoding | Advanced decoders (beam search, best_k, var, …) | ✅ Decoders | - |
| Decoding | Multi-Variable Templates | | |
| Decoding | Vary multiple decoders in templates | In Development | |
| Decoding | Conditional distributions | distribution clause | - |
| Model Support | OpenAI API | | |
| Model Support | Azure OpenAI | | |
| Model Support | 🤗 Transformers | | |
| Constraints | Simple Token Length Constraints | | |
| Constraints | RESULT in [a, b, …] | | |
| Constraints | Character-level constraints (number of words, character length) | | - |
| Constraints | Datatype Constraints (e.g. integer only) | | - |
| Constraints | Extendible constraint system | Formal Semantics + Extendible + [Paper] | - |
| Advanced Applications | JSON Decoding | Type Constraints (Preview Release) and Internal Implementation | Snippet |
| Advanced Applications | Role Tags | Playground (ChatGPT only for now) | Snippet |
| Advanced Applications | Tool Use | Calculator, Search | Search |
| Advanced Applications | Generating Tabular Data | LMQL and Pandas | - |
| Advanced Applications | Algorithmic Prompting | LLM-based Sorting Algorithms | - |
| Advanced Applications | Interactive Chat Interface | Chat in the Playground | - |
| Advanced Applications | Code Interpreter | Execute Python Code in LMQL | - |
| Advanced Applications | Inline Tool Use | Calculator, Key-Value Storage | - |
| Runtime Optimization | Tree-based Token Caching | (Blog) | - |
| Runtime Optimization | Transformers Key-Value Caching | In Development | ✅ (guidance acceleration) |
| Runtime Optimization | Cache Persistence across multiple runs | (Blog) | - |
| Runtime Optimization | Token Healing | In Development | |
| Runtime Optimization | Eager Constraint Evaluation and Short-Circuiting | (Blog) | - |
| Library Integration | LangChain | LMQL queries can be used seamlessly as LangChain Chain objects | - |
| Library Integration | LlamaIndex | LMQL can directly call and leverage LlamaIndex data structures during decoding | - |
| Tooling | Interactive Use | ✅ Interactive Playground IDE with visual decoding tree and editor | ✅ Jupyter Notebook Integration |
| Tooling | Visual Studio Code | ✅ Extension | - |
| Output (Streaming) | Streaming Model Output | | |
| Output (Streaming) | WebSocket streaming | ✅ GitHub | - |
| Output (Streaming) | REST endpoint | ✅ GitHub | - |
| Output (Streaming) | Server-Sent Event streaming | ✅ GitHub | - |
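The Async API entry in the comparison mentions running hundreds of queries in parallel. As a rough illustration of that general pattern only, the sketch below uses Python's standard asyncio.gather to run many coroutines concurrently. The names fake_query and run_all are hypothetical stand-ins and are not part of LMQL or Guidance; the sketch does not model cross-query optimization or batching.

```python
import asyncio


async def fake_query(prompt: str) -> str:
    """Hypothetical stand-in for one language model query (not an LMQL call)."""
    await asyncio.sleep(0.01)  # simulate waiting on a model or API response
    return f"answer to: {prompt}"


async def run_all(prompts: list[str]) -> list[str]:
    # Schedule every query at once; gather returns results in input order.
    return await asyncio.gather(*(fake_query(p) for p in prompts))


# Run 200 simulated queries concurrently rather than one after another.
results = asyncio.run(run_all([f"question {i}" for i in range(200)]))
```

Because the simulated queries spend their time waiting rather than computing, the 200 calls overlap instead of running back to back. A real query function would replace fake_query while keeping the same gather-based structure.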