The programming language of Anglican
is a subset of Clojure, extended with a few special forms that make it a
probabilistic programming language. These forms are `sample`, for drawing
samples from distributions, and `observe`, for conditioning on data.
There are other special forms (`mem`, `store`, and `retrieve`) which make
writing probabilistic programs easier.

The following documentation is quite terse because
Anglican is, to a large extent, intentionally syntactically indistinguishable from Clojure.

Clojure reference materials, readily found on the web,
are as essential to programming in Anglican as this Anglican
language documentation. The key to Anglican is knowing
Clojure *and* understanding `defquery`, the interface between
Clojure and Anglican.

This interface is meant to be as transparent as possible, and
as much Clojure functionality as possible is meant to work inside `defquery`.

In general, the documentation that follows describes functionality *relative to Clojure*.
For instance, if a Clojure language feature is not explicitly mentioned below,
it probably isn’t supported in Anglican.

Anglican programs reside in Clojure source code modules
and are delimited by `defquery` (a macro). To enable the Anglican
language in a Clojure module, at a minimum the namespaces
`anglican.runtime` and `anglican.emit` must be used. A simple way to
do this is to write

```
(ns example
  (:use [anglican emit runtime]))
```

at the beginning of a Clojure module, for example ‘example.clj’. Clojure namespacing is notably complex; arguably the best strategy for writing a namespace declaration that includes Anglican functionality is to copy one from the provided examples.

The rest of the module
may contain one or more Anglican programs, also known as *queries*. Queries
are delimited by the keyword `defquery`, followed by the query name
and the program text. If there is only one program in a
namespace, it customarily bears the same name as the namespace:

```
(defquery example
  (let [bet (sample (beta 5 3))]
    (observe (flip bet) true)
    (> bet 0.7)))
```

`defquery` assigns a value to `example` that allows it to be passed
as the argument to `doquery`. `doquery` uses inference algorithms to
produce a lazy sequence of weighted samples that characterize the
conditional distribution of the value of the expression in the tail position of
the `defquery`, here `(> bet 0.7)`.
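As a minimal sketch of how such a query might be run, the code below assumes that `doquery` can be referred from `anglican.core`, that `:importance` names an available inference algorithm, that a query without parameters can be given `nil` as its argument, and that each returned sample is a map with `:result` and `:log-weight` entries. None of these details are stated above, so they should be checked against the installed Anglican version. The namespace `example.run` and the helper `estimate-probability` are hypothetical names chosen for illustration.

```clojure
(ns example.run
  (:require [anglican.core :refer [doquery]])   ; assumed location of doquery
  (:use [anglican emit runtime]))

(defquery example
  (let [bet (sample (beta 5 3))]
    (observe (flip bet) true)
    (> bet 0.7)))

(defn estimate-probability
  "Weighted fraction of the first n samples whose :result is true."
  [n]
  (let [samples (take n (doquery :importance example nil))
        ;; :log-weight is assumed to hold the log of the sample weight
        weights (map #(Math/exp (:log-weight %)) samples)
        hits    (map (fn [s w] (if (:result s) w 0.0)) samples weights)]
    (/ (reduce + hits) (reduce + weights))))
```

Calling `(estimate-probability 1000)` would then give a rough estimate of the posterior probability that `bet` exceeds 0.7; because the sequence is lazy, only the first 1000 samples are computed.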

Syntactically `defquery` denotes a joint distribution, `observe` forms denote
which random variables’ values are known, and the value of the last, “return”
expression is the variable whose conditional distribution is of interest.

What follows is a description of what program text can go inside `defquery`, i.e. a
description of the Anglican language. Note that while the source code inside `defquery`
intentionally bears a great deal of resemblance to Clojure, it is not Clojure code;
it is Anglican code.

The Anglican language is a subset of Clojure. Within `defquery`, the
`let`, `if`, `when`, `cond`, `case`, `and`, `or`, and `fn` forms are
supported (others may be in the future but are not now).
In `let` bindings and `fn` argument lists,
vector destructuring (but not hash map destructuring) is
supported. Compound literals for vectors, hash maps, and
sets are supported just like in Clojure.

`recur`

Anglican is stackless, so no recursive call can lead to stack
overflow and `recur` is unnecessary; recursive calls to functions
should be used instead. However, `loop`/`recur` is provided for
convenience as a way to express loops. Using `recur` outside of `loop`
will lead to unpredictable behaviour and hard-to-catch errors.
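To make the two styles concrete, the hypothetical queries below count heads in ten fair coin flips, once with a self-recursive named `fn` and once with `loop`/`recur`. The query names and the namespace `example.loops` are invented for this sketch; it uses only forms listed as supported above.

```clojure
(ns example.loops
  (:use [anglican emit runtime]))

;; Recursive style: a plain self-call, no `recur` required.
(defquery heads-recursive
  (let [count-heads (fn count-heads [n]
                      (if (= n 0)
                        0
                        (+ (if (sample (flip 0.5)) 1 0)
                           (count-heads (- n 1)))))]
    (count-heads 10)))

;; Loop style: `recur` appears only inside `loop`.
(defquery heads-loop
  (loop [i 0
         heads 0]
    (if (= i 10)
      heads
      (recur (+ i 1)
             (if (sample (flip 0.5)) (+ heads 1) heads)))))
```

Both queries return an integer between 0 and 10; which style to use is a matter of taste.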

All of Clojure’s core library,
except for higher-order functions
(functions that accept other functions as arguments), is
available in Anglican. In addition, the following higher-order
functions are implemented: `map`, `reduce`, `filter`, `some`,
`repeatedly`, `comp`, `partial`.

`defquery`

The border between Clojure and Anglican is subtle and will usually pose no problem to most programmers. However, some confusion can arise from the fact that Anglican programs are compiled by macros into CPS-style Clojure functions. This means that “native” Clojure functions need some wrapping before they can be used in Anglican. Errors arising from misunderstanding this boundary typically crop up as “wrong number of arguments” exceptions. Carefully following the guidance in this section will resolve most if not all such difficulties.

Data variables may be defined outside of `defquery` using `def`
and used inside `defquery`. Anglican functions outside of
`defquery` may be defined using `defm` (with the same syntax as
`defn`, albeit with a single arity only). Their bodies may use
the same subset of Clojure as `defquery`, as well as
probabilistic and state access forms. `defm`-defined functions
can be called from Anglican without restrictions.
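A short sketch of these two kinds of definitions follows. The names `observations`, `observe-all`, and `gaussian-mean`, as well as the namespace `example.defm`, are hypothetical and serve only to illustrate the pattern.

```clojure
(ns example.defm
  (:use [anglican emit runtime]))

;; Plain Clojure data, defined with `def` and visible inside the query.
(def observations [1.2 0.8 1.1])

;; An Anglican function: its body may use `observe` (and `sample`).
(defm observe-all [mu data]
  (reduce (fn [_ y] (observe (normal mu 1.0) y)) nil data))

;; The query calls the `defm`-defined function like any other function.
(defquery gaussian-mean
  (let [mu (sample (normal 0.0 10.0))]
    (observe-all mu observations)
    mu))
```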

Functions defined outside of `defquery` using `defn` may use the
full Clojure syntax but no Anglican extensions, and must be
declared primitive using `with-primitive-procedures`:

```
(with-primitive-procedures [name ...]
  body)
```

where `name ...` is the list of primitive procedures. The names
can be namespace-qualified, but will be seen unqualified in the
lexical scope of the form. For example,

```
(with-primitive-procedures [clojure.string/capitalize]
  (defquery foo
    (capitalize "hello")))
```

This query denotes the Dirac distribution over `Hello` (capitalized).

In Anglican there are two probabilistic forms: `sample` and `observe`.

`(sample distribution)` returns a sample from a `distribution`.

`(observe distribution value)` returns `value` and, *critically*, produces a conditioning side effect. It does this by adding the value of `(observe* distribution value)` (see below) to the log probability of the trace.
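The bookkeeping behind this side effect can be pictured with a plain Clojure sketch that does not use Anglican at all. The trace is modelled here as a hypothetical map holding a running `:log-weight`; `flip-log-prob` and `observe-sketch` are helpers invented for this illustration and are not Anglican functions, nor a description of how Anglican is implemented.

```clojure
(defn flip-log-prob
  "Log probability of the boolean `value` under a coin with bias `p`."
  [p value]
  (Math/log (if value p (- 1.0 p))))

(defn observe-sketch
  "Returns `trace` with the log probability of `value` added to its weight."
  [trace p value]
  (update trace :log-weight + (flip-log-prob p value)))

;; Two observations against a coin with bias 0.8.
(def trace
  (-> {:log-weight 0.0}
      (observe-sketch 0.8 true)
      (observe-sketch 0.8 false)))

;; (:log-weight trace) now equals log 0.8 + log 0.2, up to rounding.
```

Each observation only adds a term to the running total, which is why the order of independent observations does not affect the final weight.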

Functions can be memoized using `mem`, which accepts a function
object as its argument. If the argument is a named `fn` form,
self-recursive calls will call the memoized version of the
function. For example, every `fact` call in the following code

```
(defquery fact
  (let [fact (mem (fn fact [n]
                    (if (= n 1) 1
                        (* n (fact (- n 1))))))]
    [(fact 1) (fact 2) (fact 3) (fact 4)]))
```

will reuse previous computation.

Values can be stored in the state using `store`; values stored
during the same run of the program can be retrieved using
`retrieve`. The syntax is:

`(store key ... value)` stores `value` at `key ...` in the state.

`(retrieve key ...)` retrieves and returns the value stored at `key ...`.

`key ...` can be a sequence of any length.

For example:

```
(defquery customer
  (store :customer 4 :age 18)
  (retrieve :customer 4 :age))
```

will return 18 in `:result`.

Distributions are *Clojure* implementations of a `distribution` protocol, consisting
of two methods:

**(sample* distribution)** accepts a distribution instance and returns a sample from the distribution. The *Anglican* `sample` uses this method to generate random variable values.

**(observe* distribution value)** accepts a distribution and a value and returns the log probability of the value given the distribution. *Anglican* inference backends stop at `observe` statements and often use the return value of calling `observe*` with the same arguments as the original `observe` call to effect conditioning.
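Because these are ordinary Clojure methods, they can be called directly outside of any query. The sketch below assumes that `normal`, `sample*`, and `observe*` are all made available by using `anglican.runtime`; the namespace `example.dists` and the var names are hypothetical.

```clojure
(ns example.dists
  (:use [anglican runtime]))

;; A standard normal distribution instance.
(def d (normal 0.0 1.0))

;; Draw a value directly, outside of any query.
(def x (sample* d))

;; Log density of the standard normal at 0.0, i.e. -0.5 * log(2 * pi).
(def lp (observe* d 0.0))
```

Inside a query the Anglican forms `sample` and `observe` should be used instead, since only those interact with the inference algorithm.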

The core runtime library provides the following distribution constructors, which can be used either in Clojure or in Anglican, bearing in mind the difference between `sample` and `sample*`, and between `observe` and `observe*`:

**(bernoulli p)** constructs a single binomial trial. Calling `sample` on the returned distribution instance generates 1 with probability `p` and 0 with probability `1-p`.

**(beta a b)** constructs a Beta distribution with pseudocounts `a` and `b`. Calling `sample` on the returned distribution instance generates a `double` on the interval `[0,1)`.

**(binomial n p)** constructs a Binomial distribution with success probability `p` and number of trials `n`. Calling `sample` on the returned distribution instance generates a `long` on the interval `[0 ... n]`.

**(categorical pairs)** constructs a categorical distribution parameterized by a list of pairs `(val p)`. Calling `sample` on the returned distribution instance generates `val-k` with probability `p-k`.

**(dirichlet [alpha-1 … alpha-K])** constructs a Dirichlet distribution parameterized by a vector of pseudocounts `alpha`. Calling `sample` on the returned distribution instance generates a vector of probabilities `prob` such that `(sum prob) = 1.0` and `(count prob) = (count alpha)`.

**(discrete p)** constructs a discrete distribution parameterized by a list of probabilities `p`. Calling `sample` on the returned distribution instance generates a `long` in the range `[0 ... K-1]`, with `K = (count p)`. The result `k` is returned with probability `(nth p k)`.

**(exponential l)** constructs an exponential distribution with rate parameter `l`. Calling `sample` on the returned distribution instance generates a `double` in the domain `[0, Inf)`.

**(flip p)** constructs a single binomial trial. Calling `sample` on the returned distribution instance generates `true` with probability `p` and `false` with probability `1-p`.

**(gamma a b)** constructs a Gamma distribution with shape `a` and rate `b`. Calling `sample` on the returned distribution instance generates a `double` on the domain `(0, Inf)`.

**(mvn mean cov)** constructs a multivariate normal distribution with mean `mean` and covariance matrix `cov`. Calling `sample` on the returned distribution instance generates a `double` vector of the same size as `mean`.

**(normal mean std)** constructs a normal distribution with mean `mean` and standard deviation `std`.

**(poisson l)** constructs a Poisson distribution with rate `l`. Calling `sample` on the returned distribution instance generates a non-negative `long`.

**(uniform-continuous min max)** constructs a uniform continuous distribution. Calling `sample` on the returned distribution instance generates a `double` in the domain `[min, max]`.

**(uniform-discrete min max)** constructs a uniform discrete distribution. Calling `sample` on the returned distribution instance generates a `long` from the range `[min ... max-1]`.

**(wishart n V)** constructs a Wishart distribution with `n` degrees of freedom and scale matrix `V`. Calling `sample` on the returned distribution instance generates a matrix of `double` of the same size as `V`.

In addition, so-called random processes are provided by the
runtime, including CRP (Chinese Restaurant Process), DP
(Dirichlet Process), and GP (Gaussian Process). Random processes
implement the `random-process` protocol, consisting of two
methods:

**(produce process)** accepts a process instance and returns a
distribution object (which can be passed as a parameter to
`observe` and `sample`) corresponding to the current state of
the process.

**(absorb process value)** updates the process by incorporating
a value sampled or observed from the distribution produced by
the process.

The following random process constructors are included in the core runtime library:

**(CRP alpha)** is a Chinese restaurant process with concentration `alpha`.

**(DP alpha H)** is a Dirichlet process with concentration `alpha` over base distribution `H`.

**(GP m k)** is a Gaussian process with mean function `m` and covariance function `k`.
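The `produce`/`absorb` cycle might be used inside a query as sketched below, where five customers are seated by a Chinese restaurant process. The query name `crp-tables` and the namespace `example.crp` are hypothetical, and the sketch assumes that `produce` and `absorb` can be called inside a query like any other runtime function.

```clojure
(ns example.crp
  (:use [anglican emit runtime]))

(defquery crp-tables
  (loop [process (CRP 1.0)
         tables []
         i 0]
    (if (= i 5)
      tables
      ;; `produce` gives the distribution for the next draw;
      ;; `absorb` returns the process updated with that draw.
      (let [table (sample (produce process))]
        (recur (absorb process table)
               (conj tables table)
               (+ i 1))))))
```

Note that `absorb` does not mutate the process; the updated process it returns has to be threaded through the loop explicitly.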

Other distributions and processes can be defined by the user.
The definition can be placed into Clojure modules containing
Anglican programs. A user-defined distribution is specified
using `defdist`:

```
(defdist dirac
  "Dirac distribution"
  [x] ; distribution parameters
  [] ; auxiliary bindings
  (sample* [this] x)
  (observe* [this value] (if (= x value) 0.0 (- (/ 1.0 0.0)))))
```

Similarly, a user-defined random process is specified using
`defproc`:

```
(defproc DSD
  "discrete-symmetric-dirichlet process"
  [alpha N] ; process parameters
  [counts (vec (repeat N (double alpha)))] ; auxiliary bindings
  (produce [this] (discrete counts))
  (absorb [this sample]
    (DSD alpha N (update-in counts [sample] + 1.))))
```

Constructors of user-defined distributions and processes must be
declared primitive using `with-primitive-procedures`.
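Putting the pieces together, a user-defined distribution might be declared primitive and used in a query as follows. This is a sketch under assumptions: `point-mass` is a hypothetical distribution with the same behaviour as `dirac` above (renamed to avoid clashing with any existing definition), and `always-three` and `example.userdist` are invented names.

```clojure
(ns example.userdist
  (:use [anglican emit runtime]))

(defdist point-mass
  "Distribution that puts all of its mass on x."
  [x] ; distribution parameters
  []  ; auxiliary bindings
  (sample* [this] x)
  (observe* [this value]
    (if (= x value) 0.0 Double/NEGATIVE_INFINITY)))

;; The constructor is declared primitive before it is called in a query.
(with-primitive-procedures [point-mass]
  (defquery always-three
    (sample (point-mass 3))))
```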