Retrieval for Structured Data

Motivation

Large language models (LLMs) are trained on vast but fixed datasets, which limits their ability to access up-to-date or domain-specific information.

To enhance their performance on specific tasks, we can augment their knowledge using retrieval systems.

Retrieval systems fetch relevant information from external sources, which can then be included in the prompt given to the model.
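As a rough illustration of the retrieve-then-prompt flow described above, here is a minimal sketch. The `DOCUMENTS` list, the word-overlap scoring, and the prompt wording are all assumptions made for this example; a real retrieval system would typically query a search index, vector store, or database.

```python
import re
from typing import List

# Hypothetical in-memory "external source" used only for illustration.
DOCUMENTS = [
    "The refund window for online orders is 30 days.",
    "Support is available on weekdays from 9am to 5pm.",
    "Gift cards cannot be exchanged for cash.",
]


def tokenize(text: str) -> set:
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, documents: List[str], k: int = 1) -> List[str]:
    """Return the k documents sharing the most words with the question."""
    question_words = tokenize(question)
    ranked = sorted(
        documents,
        key=lambda doc: len(question_words & tokenize(doc)),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question: str, context: List[str]) -> str:
    """Place the retrieved passages in the prompt ahead of the question."""
    joined = "\n".join(f"- {passage}" for passage in context)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{joined}\n\n"
        f"Question: {question}"
    )


question = "How long is the refund window?"
prompt = build_prompt(question, retrieve(question, DOCUMENTS))
print(prompt)
```

The resulting string is what would be sent to the model, so the answer can be grounded in the retrieved passage rather than in the model's training data alone.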

Key benefits of using retrieval systems include:

  1. Access to recent or private information
  2. Improved accuracy on domain-specific tasks
  3. Reduced hallucination by grounding responses in retrieved facts
  4. Cost-effective alternative to fine-tuning for factual recall

Structured Data

A large fraction of the world's operational data is structured, often organized into database tables with a specific schema.
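To make "tables with a specific schema" concrete, the sketch below creates a small table in an in-memory SQLite database and reads its column names and types back. The `orders` table and its columns are hypothetical and chosen only for this example.

```python
import sqlite3

# In-memory database so the example leaves nothing on disk.
conn = sqlite3.connect(":memory:")

# The schema fixes the column names, types, and constraints up front.
conn.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, "
    "customer TEXT NOT NULL, "
    "total REAL NOT NULL)"
)
conn.executemany(
    "INSERT INTO orders (customer, total) VALUES (?, ?)",
    [("alice", 20.0), ("bob", 35.5)],
)

# PRAGMA table_info yields one row per column:
# (cid, name, type, notnull, dflt_value, pk)
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(orders)")]
print(columns)
```

Because the schema is explicit and machine-readable, it can be handed to a model as context when generating queries.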

Various domain-specific languages (DSLs) have been developed to interact with these systems, including SQL, Cypher, and PQL.

Query Construction

A popular approach to interacting with structured data is to use an LLM to convert natural language queries into a DSL for the relevant database.

In particular, text-to-SQL and text-to-Cypher are useful ways to interact with relational and graph databases, respectively.
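A text-to-SQL flow can be sketched roughly as follows: read the database schema, place it in a prompt alongside the user's question, ask a model for a query, and run the result. Everything here is an assumption for illustration: `fake_llm` is a hypothetical stand-in that returns a canned query rather than calling a real model, and the `employees` table and prompt wording are invented for the example.

```python
import sqlite3
from typing import Callable


def get_schema(conn: sqlite3.Connection) -> str:
    """Collect the CREATE TABLE statements SQLite stores for each table."""
    rows = conn.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table' AND sql IS NOT NULL"
    ).fetchall()
    return "\n".join(row[0] for row in rows)


def build_sql_prompt(question: str, schema: str) -> str:
    """Give the model the schema so it can refer to real tables and columns."""
    return (
        "Given this SQLite schema:\n"
        f"{schema}\n\n"
        "Write a single SQL query that answers the question.\n"
        f"Question: {question}\n"
        "SQL:"
    )


def fake_llm(prompt: str) -> str:
    """Hypothetical model call; a real system would send the prompt to an LLM."""
    return "SELECT COUNT(*) FROM employees WHERE department = 'Engineering';"


def text_to_sql(
    question: str, conn: sqlite3.Connection, llm: Callable[[str], str]
) -> str:
    """Turn a natural language question into a SQL string via the model."""
    prompt = build_sql_prompt(question, get_schema(conn))
    return llm(prompt).strip().rstrip(";")


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, department TEXT)"
)
conn.executemany(
    "INSERT INTO employees (name, department) VALUES (?, ?)",
    [("Ada", "Engineering"), ("Grace", "Engineering"), ("Linus", "Sales")],
)

sql = text_to_sql("How many engineers do we have?", conn, fake_llm)
result = conn.execute(sql).fetchone()[0]
print(sql)
print(result)
```

In practice, model-generated SQL is untrusted input, so running it with a read-only connection or restricted database permissions is a sensible precaution.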

| Name | When to Use | Description |
| --- | --- | --- |
| Text-to-SQL | If users are asking questions that require information housed in a relational database, accessible via SQL. | This uses an LLM to transform user input into a SQL query. |
| Text-to-Cypher | If users are asking questions that require information housed in a graph database, accessible via Cypher. | This uses an LLM to transform user input into a Cypher query. |
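For the graph case, the main difference is that the prompt describes node labels and relationship types rather than tables. The sketch below only builds the prompt and returns a query string; it does not connect to a graph database. The `GRAPH_SCHEMA` text, the `fake_llm` stand-in, and the movie example are all hypothetical.

```python
from typing import Callable

# Hypothetical description of a small graph: node labels with their
# properties, and the relationship type that connects them.
GRAPH_SCHEMA = (
    "Nodes: (:Person {name}), (:Movie {title})\n"
    "Relationships: (:Person)-[:ACTED_IN]->(:Movie)"
)


def build_cypher_prompt(question: str, schema: str) -> str:
    """Describe the graph's shape so the model uses existing labels."""
    return (
        "Given this graph schema:\n"
        f"{schema}\n\n"
        "Write a single Cypher query that answers the question.\n"
        f"Question: {question}\n"
        "Cypher:"
    )


def fake_llm(prompt: str) -> str:
    """Hypothetical model call returning a canned Cypher query."""
    return "MATCH (p:Person)-[:ACTED_IN]->(m:Movie {title: 'The Matrix'}) RETURN p.name"


def text_to_cypher(question: str, schema: str, llm: Callable[[str], str]) -> str:
    """Turn a natural language question into a Cypher string via the model."""
    return llm(build_cypher_prompt(question, schema)).strip()


cypher = text_to_cypher("Who acted in The Matrix?", GRAPH_SCHEMA, fake_llm)
print(cypher)
```

The returned string would then be passed to a graph database driver for execution, which is left out here.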

See our tutorials on text-to-SQL and text-to-Cypher for more details.

Tip: See our blog post overview and RAG from Scratch video on query construction.

