Datalog is a declarative logic programming language that syntactically is a subset of Prolog. It is often used as a query language for deductive databases. In recent years, Datalog has found new application in data integration, information extraction, networking, program analysis, security, and cloud computing.[1]

Its origins date back to the beginning of logic programming, but it became prominent as a separate area around 1977 when Hervé Gallaire and Jack Minker organized a workshop on logic and databases.[2] David Maier is credited with coining the term Datalog.[3]

Features, limitations and extensions

Unlike in Prolog, statements of a Datalog program can be stated in any order. Furthermore, Datalog queries on finite sets are guaranteed to terminate, so Datalog does not have Prolog's cut operator. This makes Datalog a truly declarative language.

In contrast to Prolog, Datalog

  1. disallows complex terms as arguments of predicates, e.g., p(1, 2) is admissible but not p(f(1), 2),
  2. imposes certain stratification restrictions on the use of negation and recursion,
  3. requires that every variable that appears in the head of a clause also appears in a nonarithmetic positive (i.e. not negated) literal in the body of the clause,
  4. requires that every variable appearing in a negative literal in the body of a clause also appears in some positive literal in the body of the clause.[4]
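
Restrictions 3 and 4 can be checked mechanically. The following is a minimal sketch in Python, not taken from any particular Datalog system: the Literal type and the is_safe function are hypothetical names introduced for illustration, variables are assumed to be strings starting with an uppercase letter, and arithmetic literals are not modelled at all.

```python
from typing import NamedTuple


class Literal(NamedTuple):
    """Hypothetical representation of one literal, e.g. parent(X, Z)."""
    predicate: str
    args: tuple            # flat terms only: constants or variables (restriction 1)
    negated: bool = False


def is_variable(term) -> bool:
    # Assumption: variables are strings starting with an uppercase letter.
    return isinstance(term, str) and term[:1].isupper()


def variables(literal: Literal) -> set:
    return {term for term in literal.args if is_variable(term)}


def is_safe(head: Literal, body: list) -> bool:
    """Check restrictions 3 and 4 for one rule (arithmetic literals are ignored)."""
    positive_vars = set()
    for literal in body:
        if not literal.negated:
            positive_vars |= variables(literal)
    # Restriction 3: every head variable occurs in a positive body literal.
    if not variables(head) <= positive_vars:
        return False
    # Restriction 4: every variable of a negative literal occurs in a positive one.
    return all(variables(lit) <= positive_vars for lit in body if lit.negated)


# ancestor(X,Y) :- parent(X,Z), ancestor(Z,Y).   satisfies both restrictions
safe_rule = is_safe(
    Literal("ancestor", ("X", "Y")),
    [Literal("parent", ("X", "Z")), Literal("ancestor", ("Z", "Y"))],
)
# p(X,Y) :- q(X).   violates restriction 3: Y is not bound in the body
unsafe_rule = is_safe(Literal("p", ("X", "Y")), [Literal("q", ("X",))])
print(safe_rule, unsafe_rule)
```

A real implementation would also have to treat arithmetic and built-in literals separately, since restriction 3 excludes them from the literals that may bind a head variable.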

Query evaluation with Datalog is based on first-order logic, and is thus sound and complete. However, Datalog is not Turing complete, and is thus used as a domain-specific language that can take advantage of efficient algorithms developed for query resolution. Indeed, various methods have been proposed to efficiently perform queries, e.g., the Magic Sets algorithm,[5] tabled logic programming[6] or SLG resolution.[7]

Some widely used database systems include ideas and algorithms developed for Datalog. For example, the SQL:1999 standard includes recursive queries, and the Magic Sets algorithm (initially developed for the faster evaluation of Datalog queries) is implemented in IBM's DB2.[8] Moreover, Datalog engines are behind specialised database systems such as Intellidimension's database for the semantic web.
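
As an illustration of how recursive SQL queries relate to recursive Datalog rules, the following sketch expresses the ancestor relation (the example used later in this article) as a recursive common table expression. It assumes Python's standard sqlite3 module and SQLite's WITH RECURSIVE support; the table and column names are made up for the example, and other database systems may differ in syntax details.

```python
import sqlite3

# In-memory database holding the two parent facts from the example program.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parent (p TEXT, c TEXT)")
conn.executemany(
    "INSERT INTO parent VALUES (?, ?)",
    [("bill", "mary"), ("mary", "john")],
)

# The two branches of the UNION mirror the two ancestor rules:
#   ancestor(X,Y) :- parent(X,Y).
#   ancestor(X,Y) :- parent(X,Z), ancestor(Z,Y).
query = """
WITH RECURSIVE ancestor(anc, descendant) AS (
    SELECT p, c FROM parent
    UNION
    SELECT parent.p, ancestor.descendant
    FROM parent JOIN ancestor ON parent.c = ancestor.anc
)
SELECT descendant FROM ancestor WHERE anc = ? ORDER BY descendant
"""

# Corresponds to the Datalog query ?- ancestor(bill,X).
descendants = [row[0] for row in conn.execute(query, ("bill",))]
print(descendants)  # ['john', 'mary']
```

UNION (rather than UNION ALL) discards duplicate rows, which matches the set semantics of Datalog and lets the recursion stop once no new rows appear.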

Several extensions have been made to Datalog, e.g., to support aggregate functions, to allow object-oriented programming, or to allow disjunctions as heads of clauses. These extensions have significant impacts on the definition of Datalog's semantics and on the implementation of a corresponding Datalog interpreter.


Example Datalog program:

 parent(bill, mary).
 parent(mary, john).

These two lines define two facts, i.e. things that always hold. They can be intuitively understood as: the parent of mary is bill and the parent of john is mary.

 ancestor(X,Y) :- parent(X,Y).
 ancestor(X,Y) :- parent(X,Z),ancestor(Z,Y).

These two lines describe the rules that define the ancestor relationship. A rule consists of two main parts separated by the :- symbol. The part to the left of this symbol is the head of the rule; the part to the right is the body. A rule is read (and can be intuitively understood) as "<head> if it is known that <body>". Uppercase letters stand for variables. Hence, in the example, the first rule can be read as "X is the ancestor of Y if it is known that X is the parent of Y", and the second rule as "X is the ancestor of Y if it is known that X is the parent of some Z and Z is the ancestor of Y". The ordering of the clauses is irrelevant in Datalog, in contrast to Prolog, which depends on the ordering of clauses for computing the result of the query call.

Datalog distinguishes between extensional predicate symbols (defined by facts) and intensional predicate symbols (defined by rules).[9] In the example above, ancestor is an intensional predicate symbol and parent is extensional. A predicate may also be defined by both facts and rules, and is then neither purely extensional nor purely intensional; however, any Datalog program can be rewritten into an equivalent program without such dual-role predicate symbols.
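
The distinction can be made concrete with a small sketch. The Python function below is a hypothetical helper, not part of any Datalog system: it assumes a program is given as a list of facts (predicate name plus arguments) and a list of rules (head predicate plus body predicate names), and sorts the predicate symbols into the three groups described above.

```python
def classify_predicates(facts, rules):
    """Split predicate symbols into extensional, intensional and mixed.

    facts: list of (predicate, args) pairs
    rules: list of (head_predicate, body_predicates) pairs
    """
    fact_predicates = {predicate for predicate, _ in facts}
    rule_predicates = {head for head, _ in rules}
    return {
        "extensional": fact_predicates - rule_predicates,  # facts only
        "intensional": rule_predicates - fact_predicates,  # rules only
        "mixed": fact_predicates & rule_predicates,        # both
    }


# The example program from this article.
facts = [("parent", ("bill", "mary")), ("parent", ("mary", "john"))]
rules = [("ancestor", ["parent"]), ("ancestor", ["parent", "ancestor"])]

roles = classify_predicates(facts, rules)
print(roles)
```

For the example program this reports parent as extensional and ancestor as intensional, with no mixed predicates.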

 ?- ancestor(bill,X).

The query above asks for everyone of whom bill is an ancestor, and would return mary and john when posed against a Datalog system containing the facts and rules described above.
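
One way to see why this query returns mary and john is to compute the ancestor relation bottom-up: start from the facts and apply the rules repeatedly until nothing new can be derived. The Python sketch below hard-codes the two ancestor rules for illustration; it shows naive fixpoint evaluation under that assumption and is not how any specific Datalog engine is implemented.

```python
def evaluate_ancestor(parent_facts):
    """Naive bottom-up evaluation of the two ancestor rules."""
    parent = set(parent_facts)
    # Rule 1: ancestor(X,Y) :- parent(X,Y).
    ancestor = set(parent)
    while True:
        # Rule 2: ancestor(X,Y) :- parent(X,Z), ancestor(Z,Y).
        derived = {
            (x, y)
            for (x, z) in parent
            for (z2, y) in ancestor
            if z == z2
        }
        new_facts = derived - ancestor
        if not new_facts:
            # Fixpoint reached: no rule derives anything new.
            return ancestor
        ancestor |= new_facts


facts = [("bill", "mary"), ("mary", "john")]
ancestor = evaluate_ancestor(facts)

# ?- ancestor(bill,X).
answers = sorted(y for (x, y) in ancestor if x == "bill")
print(answers)  # ['john', 'mary']
```

The loop always terminates because only finitely many pairs can be built from the constants in the facts, which reflects the termination guarantee for Datalog queries on finite sets. The result also does not depend on the order in which the rules are applied.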

Systems implementing Datalog

Here is a short list of systems that are either based on Datalog or provide a Datalog interpreter:

Free software/Open source

Entries are grouped by the language the system is written in. Each entry gives the name, a description and the licence; "try it online" availability and external database support are noted where listed.

In Java

  • IRIS[10]: extends Datalog with function symbols, built-in predicates, locally stratified or un-stratified logic programs (using the well-founded semantics), unsafe rules and XML schema data types (GNU LGPL v2.1).
  • Jena: a Semantic Web framework which includes a Datalog implementation as part of its general purpose rule engine, which provides OWL and RDFS support[11] (Apache v2).
  • SociaLite[12]: a Datalog variant for large-scale graph analysis developed at Stanford (Apache v2).
  • Graal[13]: a Java toolkit dedicated to querying knowledge bases within the framework of existential rules, aka Datalog+/- (CeCILL v2.1).
  • Flix[14] (try it online: yes[15]): a functional and logic programming language inspired by Datalog, extended with user-defined lattices and monotone filter/transfer functions (Apache v2).

In C

  • XSB: a logic programming and deductive database system for Unix and MS Windows with tabling giving Datalog-like termination and efficiency, including incremental evaluation[16] (GNU LGPL).

In C++

  • Coral[17]: a deductive database system written in C++ with semi-naïve Datalog evaluation. Developed 1988-1997 (custom licence, free for non-commercial use).
  • Inter4QL[18]: an open-source command-line interpreter of the Datalog-like 4QL query language, implemented in C++ for Windows, Mac OS X and Linux. Negation is allowed in heads and bodies of rules as well as in recursion (GNU GPL v3).
  • RDFox[19]: an RDF triple store with Datalog reasoning. Implements the FBF algorithm for incremental evaluation (custom licence, free for non-commercial use[20]).
  • Souffle[21]: an open-source Datalog-to-C++ compiler converting Datalog into high-performance, parallel C++ code, specifically designed for complex Datalog queries over large data sets, as encountered e.g. in the context of static program analysis (UPL v1.0).

In Python

  • pyDatalog (external database: 11 dialects of SQL): adds logic programming to Python's toolbox. It can run logic queries on databases or Python objects, and use logic clauses to define the behavior of Python classes (GNU LGPL).

In Ruby

  • bloom / bud: a Ruby DSL for programming with data-centric constructs, based on the Dedalus extension of Datalog, which adds a temporal dimension to the logic (BSD 3-Clause).

In Lua

  • Datalog[22] (try it online: yes[23]): a lightweight deductive database system (GNU LGPL).

In Prolog

  • DES[24]: an open-source implementation to be used for teaching Datalog in courses (GNU LGPL).

In Clojure

  • Cascalog (external database: Hadoop): a Clojure library for querying data stored on Hadoop clusters (Apache).
  • Clojure Datalog: a contributed library implementing aspects of Datalog (Eclipse Public License 1.0).
  • Datascript: an in-memory immutable database and Datalog query engine that runs in the browser (Eclipse Public License 1.0).

In Racket

  • Datalog for Racket[25] (GNU LGPL).
  • Datafun[26]: generalized Datalog on semilattices (GNU LGPL).

In Tcl

  • tclbdd[27]: an implementation based on binary decision diagrams. Built to support development of an optimizing compiler for Tcl (BSD).

In Haskell

  • Dyna[28]: a declarative programming language for statistical AI programming. The language is based on Datalog, supports both forward and backward chaining, and incremental evaluation (GNU AGPLv3).

In other or unknown languages

  • bddbddb[29]: an implementation of Datalog done at Stanford University. It is mainly used to query Java bytecode, including points-to analysis on large Java programs (GNU LGPL).
  • ConceptBase[30]: a deductive and object-oriented database system based on a Datalog query evaluator: Prolog for triggered procedures and rewrites, axiomatized Datalog called "Telos" for (meta)modeling. It is mainly used for conceptual modeling and metamodeling (FreeBSD-style license). Prolog, Java, C++.

Non-free software

  • Datomic is a distributed database designed to enable scalable, flexible and intelligent applications, running on new cloud architectures. It uses Datalog as the query language.
  • DLV is a commercial Datalog extension that supports disjunctive head clauses.
  • FoundationDB provides a free-of-charge database binding for pyDatalog, with a tutorial on its use.[31]
  • Leapsight Semantic Dataspace (LSD) is a distributed deductive database that offers high availability, fault tolerance, operational simplicity, and scalability. LSD uses Leaplog (a Datalog implementation) for querying and reasoning and was created by Leapsight.[32]
  • LogicBlox is a commercial implementation of Datalog used for web-based retail planning and insurance applications.
  • Profium Sense is a native RDF-compliant graph database written in Java. It provides Datalog evaluation support for user-defined rules.
  • .QL is a commercial object-oriented variant of Datalog created by Semmle.[33]
  • SecPAL is a security policy language developed by Microsoft Research.[34]
  • Stardog is a graph database implemented in Java. It provides support for RDF and all OWL 2 profiles, providing extensive reasoning capabilities, including Datalog evaluation.
  • StrixDB is a commercial, SPARQL-compliant RDF graph store with a Lua API and Datalog inference capabilities. It can be used as an httpd (Apache HTTP Server) module or standalone (although beta versions are under the Perl Artistic License 2.0).

References


  1. Huang, Green, and Loo, "Datalog and Emerging applications", SIGMOD 2011 (PDF), UC Davis.
  2. Gallaire, Hervé; Minker, John 'Jack', eds. (1978), "Logic and Data Bases, Symposium on Logic and Data Bases, Centre d'études et de recherches de Toulouse, 1977", Advances in Data Base Theory, New York: Plenum Press, ISBN 0-306-40060-X.
  3. Abiteboul, Serge; Hull, Richard; Vianu, Victor, Foundations of databases, p. 305.
  4. Datalog.
  5. Bancilhon. "Magic sets and other strange ways to implement logic programs" (PDF). PT: UNL. Archived from the original (PDF) on 2012-03-08.
  6. Pfenning, Frank; Schuermann, Carsten. "Twelf User's Guide". CMU.
  7. "Efficient top-down computation of queries under the well-founded semantics" (PDF).
  8. Gryz; Guo; Liu; Zuzarte. "Query sampling in DB2 Universal Database" (PDF).
  9. Lifschitz. "Datalog Programs and Their Stable Models". DE: Springer.
  10. Iris reasoner.
  11. "Jena". SourceForge.
  12. SociaLite homepage, archived from the original on 2017-09-11.
  13. Graal library.
  14. "Flix | The Programming Language".
  15. "Flix | Try Online".
  16. The XSB System, Version 3.7.x, Volume 1: Programmer's Manual (PDF).
  17. Coral Database Project web page.
  18. 4QL.
  19. RDFox web page.
  20. RDFox licence.
  21. Souffle Compiler.
  22. Ramsdell, "Datalog", Tools, NEU.
  23. Sangkok, Y, "Wrapper", Mitre Datalog, GitHub (compiled to JavaScript).
  24. Saenz-Perez, DES: A Deductive Database System, ES: ENTCS.
  25. "Datalog", Racket (technical documentation).
  26. "Datafun", Datafun in Racket (links to paper, talk and GitHub site).
  27. Kenny, Kevin B (12-14 November 2014). Binary decision diagrams, relational algebra, and Datalog: deductive reasoning for Tcl (PDF). Twenty-first Annual Tcl/Tk Conference. Portland, Oregon. Retrieved 2015. [permanent dead link]
  28. "Dyna", Dyna web page.
  29. "bddbddb", SourceForge.
  30. ConceptBase.
  31. FoundationDB Datalog Tutorial, archived from the original on 2013-08-09.
  32. "Leapsight".
  33. Semmle.
  34. "SecPAL". Microsoft Research. Archived from the original on 2007-02-23.


Further reading

  This article uses material from the Wikipedia page available here. It is released under the Creative Commons Attribution-Share-Alike License 3.0.


