Couchzilla
Couchzilla – CouchDB/Cloudant access for Julians.
Philosophy
We've tried to wrap the CouchDB API as thinly as possible, hiding the JSON and the HTTP but adding no overwrought abstractions on top. That means that a CouchDB JSON document is represented as the corresponding de-serialisation into native Julia types. For example, the document
{
"_id": "45c4affe6f40c7aaf0ba533f7a6601a2",
"_rev": "1-47e8deed9ccfcf8d061f7721d3ba085c",
"item": "Malus domestica",
"prices": {
"Fresh Mart": 1.59,
"Price Max": 5.99,
"Apples Express": 0.79
}
}
is represented as
Dict{String,Any}(
"_rev" => "1-47e8deed9ccfcf8d061f7721d3ba085c",
"prices" => Dict{String,Any}("Fresh Mart"=>1.59,"Price Max"=>5.99,"Apples Express"=>0.79),
"_id" => "45c4affe6f40c7aaf0ba533f7a6601a2",
"item" => "Malus domestica"
)
Along similar lines, Couchzilla returns CouchDB's JSON responses simply converted as-is.
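Because a document is just a nested Dict, ordinary Julia collection functions apply to it directly. The following is a minimal sketch for a current Julia release, using a hand-built Dict with the values from the example above rather than a live response; the variable names are illustrative only:

```julia
# A document as Couchzilla would hand it back: a plain nested Dict.
doc = Dict{String,Any}(
    "_id"    => "45c4affe6f40c7aaf0ba533f7a6601a2",
    "_rev"   => "1-47e8deed9ccfcf8d061f7721d3ba085c",
    "item"   => "Malus domestica",
    "prices" => Dict{String,Any}(
        "Fresh Mart"     => 1.59,
        "Price Max"      => 5.99,
        "Apples Express" => 0.79,
    ),
)

# Nested JSON objects are nested Dicts, so plain indexing works.
prices = doc["prices"]

# Find the cheapest shop using nothing but Base functions.
cheapest_price = minimum(values(prices))
cheapest_shop  = first(shop for (shop, price) in prices if price == cheapest_price)

println(doc["item"], " is cheapest at ", cheapest_shop, " (", cheapest_price, ")")
```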
CouchDB vs Cloudant
IBM Cloudant offers a clustered version of CouchDB as a service. What started out as a fork has, with version 2.0 of CouchDB, now largely come back together, and Cloudant now does (nearly) all its work directly in the Apache CouchDB repos. However, some features of Cloudant make no sense in the CouchDB context, so there are still some differences. Couchzilla tries to cover both bases, but makes no attempt to hide Cloudant-only functionality when using CouchDB.
The main differences are:
- Text indexes - Cloudant integrates with Lucene. CouchDB only has json indexes in its Mango implementation.
- Rate capping - as Cloudant sells its service in terms of provisioned throughput capacity, Cloudant will occasionally throw a 429 error indicating that the cap has been hit.
- API keys – Cloudant has a separate auth system distinct from CouchDB's _users database.
- Geospatial indexes – Cloudant has sophisticated geospatial capabilities which are not present in CouchDB.
Getting Started
Couchzilla defines two types, Client and Database. Client represents an authenticated connection to the remote CouchDB instance. Using this you can perform database-level operations, such as creating, listing and deleting databases. The Database immutable type represents a client that is connected to a specific database, allowing you to perform document-level operations.
Install the library using the normal Julia facilities: Pkg.add("Couchzilla").
Let's load up the credentials from environment variables.
username = ENV["COUCH_USER"]
password = ENV["COUCH_PASS"]
host = ENV["COUCH_HOST_URL"] # e.g. https://accountname.cloudant.com
We can now create a client connection, and use that to create a new database:
dbname = "mynewdb"
client = Client(username, password, host)
db, created = createdb(client, dbname)
If the database already existed, created will be set to false on return; true means that the database was created.
We can now add documents to the new database using createdoc. It returns an array of Dicts showing the {id, rev} tuples of the new documents:
result = createdoc(db, [
Dict("name" => "adam", "data" => "hello"),
Dict("name" => "billy", "data" => "world"),
Dict("name" => "cecilia", "data" => "authenticate"),
Dict("name" => "davina", "data" => "cloudant"),
Dict("name" => "eric", "data" => "blobbyblobbyblobby")
])
5-element Array{Any,1}:
Dict{String,Any}(Pair{String,Any}("ok",true),Pair{String,Any}("rev","1-783f91178091c10cce61c326473e8849"),Pair{String,Any}("id","93790b75ed6a59e5002cb0eddb78b42d"))
Dict{String,Any}(Pair{String,Any}("ok",true),Pair{String,Any}("rev","1-9ecba7e9a824a6fdcfb005c454fea12e"),Pair{String,Any}("id","93790b75ed6a59e5002cb0eddb78b69c"))
Dict{String,Any}(Pair{String,Any}("ok",true),Pair{String,Any}("rev","1-e05530fc65101ed432c5ee457d327952"),Pair{String,Any}("id","93790b75ed6a59e5002cb0eddb78c304"))
Dict{String,Any}(Pair{String,Any}("ok",true),Pair{String,Any}("rev","1-446bb325003aa6a995bde4e7c3dd513f"),Pair{String,Any}("id","93790b75ed6a59e5002cb0eddb78c867"))
Dict{String,Any}(Pair{String,Any}("ok",true),Pair{String,Any}("rev","1-e1f2181b3b4d7fa285b4516eee02d287"),Pair{String,Any}("id","93790b75ed6a59e5002cb0eddb78c8a1"))
This form of createdoc creates multiple documents using a single HTTP POST, which is the most efficient way of creating multiple new documents.
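Each element of the returned array is a Dict with "ok", "id" and "rev" keys, as in the output above. As a sketch of how one might collect the (id, rev) pairs of the documents that were written, here are two of the entries above copied by hand into a local array. The helper name written_pairs is hypothetical and not part of Couchzilla, and the code targets a current Julia release:

```julia
# Two entries copied from the createdoc output shown above.
sample_result = Any[
    Dict{String,Any}("ok" => true, "rev" => "1-783f91178091c10cce61c326473e8849", "id" => "93790b75ed6a59e5002cb0eddb78b42d"),
    Dict{String,Any}("ok" => true, "rev" => "1-9ecba7e9a824a6fdcfb005c454fea12e", "id" => "93790b75ed6a59e5002cb0eddb78b69c"),
]

# Hypothetical helper: keep only entries flagged "ok" and return (id, rev) tuples.
written_pairs(result) = [(r["id"], r["rev"]) for r in result if get(r, "ok", false) == true]

pairs_written = written_pairs(sample_result)
for (id, rev) in pairs_written
    println(id, " @ ", rev)
end
```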
We can read a document back using readdoc, hitting the CouchDB primary index. Note that reading back a document you just created is normally bad practice, as it will sooner or later fall foul of CouchDB's eventual consistency and give rise to sporadic, hard-to-troubleshoot errors. Having said that, let's do it anyway, and hope for the best:
id = result[2]["id"]
readdoc(db, id)
Dict{String,Any} with 4 entries:
"_rev" => "1-9ecba7e9a824a6fdcfb005c454fea12e"
"name" => "billy"
"_id" => "93790b75ed6a59e5002cb0eddb78b69c"
"data" => "world"
This returns the winning revision for the given id as a Dict.
Conflict handling and eventual consistency in CouchDB are beyond the scope of this documentation, but worth understanding fully before using CouchDB in anger.
Query
Mango (also known as Cloudant Query) is a declarative query language inspired by MongoDB. It allows us to query the database in a (slightly) more ad-hoc fashion than using map-reduce views.
In order to use this feature we first need to set up the necessary indexes:
mango_index(db, ["name", "data"])
Dict{String,Any} with 3 entries:
"name" => "f519be04f7f80838b6a88811f75de4fb83d966dd"
"id" => "_design/f519be04f7f80838b6a88811f75de4fb83d966dd"
"result" => "created"
We can now use this index to retrieve data:
mango_query(db, q"name=davina")
Couchzilla.QueryResult(Dict{AbstractString,Any}[Dict{AbstractString,Any}(Pair{AbstractString,Any}("_rev","1-446bb325003aa6a995bde4e7c3dd513f"),Pair{AbstractString,Any}("name","davina"),Pair{AbstractString,Any}("_id","93790b75ed6a59e5002cb0eddb78c867"),Pair{AbstractString,Any}("data","cloudant"))],"")
The construct q"..." (see @q_str) is a custom string literal that takes an expression in a simple DSL and converts it to the JSON representation of a Mango selector. If you are familiar with Mango selectors, you can use the raw JSON expression if you prefer:
mango_query(db, Selector("{\"name\":{\"\$eq\":\"davina\"}}"))
Couchzilla.QueryResult(Dict{AbstractString,Any}[Dict{AbstractString,Any}(Pair{AbstractString,Any}("_rev","1-446bb325003aa6a995bde4e7c3dd513f"),Pair{AbstractString,Any}("name","davina"),Pair{AbstractString,Any}("_id","93790b75ed6a59e5002cb0eddb78c867"),Pair{AbstractString,Any}("data","cloudant"))],"")
There are also coroutine versions of some of the functions that return query results. If we had many results to process, we could use paged_mango_query in a Julia Task:
for page in @task paged_mango_query(db, q"name=davina"; pagesize=10)
# Do something with the page.docs array
end
This version uses the limit and skip parameters and issues an HTTP(S) request per page.
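The skip/limit paging described above can be sketched without a server. The following assumes a recent Julia release, where a Channel plays the role that the @task producer plays in the example above. Both fetch_page (a stand-in for one HTTP request) and paged are hypothetical names and not part of Couchzilla:

```julia
# Hypothetical stand-in for a single request: returns at most `limit`
# documents, starting after the first `skip` matches.
function fetch_page(matches::Vector, skip::Int, limit::Int)
    first_idx = skip + 1
    last_idx  = min(skip + limit, length(matches))
    return matches[first_idx:last_idx]
end

# Produce pages lazily; each page costs one (simulated) request.
function paged(matches::Vector; pagesize::Int = 100)
    return Channel{Vector}() do ch
        skip = 0
        while true
            page = fetch_page(matches, skip, pagesize)
            isempty(page) && break
            put!(ch, page)
            length(page) < pagesize && break   # short page: nothing left
            skip += pagesize
        end
    end
end

docs  = [Dict("name" => "doc$i") for i in 1:25]
sizes = [length(page) for page in paged(docs; pagesize = 10)]
println(sizes)
```

Note that paging with skip means the server still has to walk past the skipped rows on every request, so this style suits modest result sets.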
Views
A powerful feature of CouchDB is its secondary indexes, known as views. They are created using a map function, most commonly written in Javascript, and optionally a reduce part. For example, to create a view on the name field, we use the following:
view_index(db, "my_ddoc", "my_view",
"""
function(doc) {
if(doc && doc.name) {
emit(doc.name, 1);
}
}""")
Dict{String,Any} with 3 entries:
"ok" => true
"rev" => "1-b950984b19bb1b8bb43513c9d5b235bc"
"id" => "_design/my_ddoc"
To read from this view, use the view_query method:
view_query(db, "my_ddoc", "my_view"; keys=["davina", "billy"])
Dict{String,Any} with 3 entries:
"rows" => Any[Dict{String,Any}(Pair{String,Any}("key","davina"),Pair{St…
"offset" => 1
"total_rows" => 5
Cloudant has an interactive tool for trying out Mango Query which is a useful resource.
Using attachments
CouchDB can store files alongside documents as attachments. This can be a convenient feature for many applications, but it has drawbacks, especially in terms of performance. If you find that you need to store large (say greater than a couple of meg) binary attachments, you should probably consider a dedicated, separate file store and only use CouchDB for metadata.
To write an attachment, use put_attachment, which expects an {id, rev} tuple referencing an existing document in the database and the path to the file holding the attachment:
data = createdoc(db, Dict("item" => "screenshot"))
result = put_attachment(db, data["id"], data["rev"], "test.png", "image/png", "data/test.png")
In order to read the attachment, use get_attachment, which returns an IO stream:
att = get_attachment(db, result["id"], "test.png"; rev=result["rev"])
open("data/fetched.png", "w") do f
write(f, att)
end
Geospatial queries
One of the fancier aspects of Cloudant is its geospatial capabilities, and Couchzilla provides access to this functionality. Using this it is possible to use Cloudant to answer questions such as "show me all documents that fall within a given radius of a given point". A full description of this capability is beyond the scope of this document, but Cloudant provides rich documentation on the subject.
In order to try out the geospatial stuff using Couchzilla, we first need some data. Cloudant provides an open database that you can replicate into your own account here. It's a database of the locations of reported crimes in the Boston area.
Let's connect Couchzilla to a replica of this database, and run through the examples from Cloudant's geospatial tutorial page. We can re-use the client from before:
geodb = connectdb(client, "crimes")
The database already contains the necessary geospatial indexes. Had this not been the case we could have indexed it using geo_index.
So let's list the first 200 crimes within a radius of 10,000 m of the Boston State House:
result = geo_query(geodb, "geodd", "geoidx";
lat = 42.357963,
lon = -71.063991,
radius = 10000.0,
limit = 200)
result["rows"]
200-element Array{Any,1}:
Dict{String,Any}(Pair{String,Any}("rev","1-caa129c6e0c9e7667cd401675859da2a"),Pair{String,Any}("id","79f14b64c57461584b152123e38fcf2b"),Pair{String,Any}("geometry",Dict{String,Any}(Pair{String,Any}("coordinates",Any[-71.0666,42.3593]),Pair{String,Any}("type","Point"))))
Dict{String,Any}(Pair{String,Any}("rev","1-e7c7eb51c49d7e5fab38b33b19542106"),Pair{String,Any}("id","79f14b64c57461584b152123e38c548a"),Pair{String,Any}("geometry",Dict{String,Any}(Pair{String,Any}("coordinates",Any[-71.0646,42.3612]),Pair{String,Any}("type","Point"))))
Dict{String,Any}(Pair{String,Any}("rev","1-de437f29d19bb55a495693fa40975962"),Pair{String,Any}("id","79f14b64c57461584b152123e38b22cc"),Pair{String,Any}("geometry",Dict{String,Any}(Pair{String,Any}("coordinates",Any[-71.06,42.3616]),Pair{String,Any}("type","Point"))))
Dict{String,Any}(Pair{String,Any}("rev","1-4c4650e64d0cc0bb01e32a0b5aca2802"),Pair{String,Any}("id","79f14b64c57461584b152123e3917804"),Pair{String,Any}("geometry",Dict{String,Any}(Pair{String,Any}("coordinates",Any[-71.06,42.3616]),Pair{String,Any}("type","Point"))))
Dict{String,Any}(Pair{String,Any}("rev","1-e557e2555201054b924f618299cb9b64"),Pair{String,Any}("id","79f14b64c57461584b152123e392e828"),Pair{String,Any}("geometry",Dict{String,Any}(Pair{String,Any}("coordinates",Any[-71.06,42.3616]),Pair{String,Any}("type","Point"))))
Dict{String,Any}(Pair{String,Any}("rev","1-86261a0030776d68d98f805afec21c94"),Pair{String,Any}("id","79f14b64c57461584b152123e38a779d"),Pair{String,Any}("geometry",Dict{String,Any}(Pair{String,Any}("coordinates",Any[-71.0587,42.3594]),Pair{String,Any}("type","Point"))))
Dict{String,Any}(Pair{String,Any}("rev","1-0892e7f4eb551df2453e9a11b274e190"),Pair{String,Any}("id","79f14b64c57461584b152123e38d6b78"),Pair{String,Any}("geometry",Dict{String,Any}(Pair{String,Any}("coordinates",Any[-71.0587,42.3594]),Pair{String,Any}("type","Point"))))
Dict{String,Any}(Pair{String,Any}("rev","1-4ce963293c1810c3fc8fe606e9345e8e"),Pair{String,Any}("id","79f14b64c57461584b152123e38ee226"),Pair{String,Any}("geometry",Dict{String,Any}(Pair{String,Any}("coordinates",Any[-71.0587,42.3594]),Pair{String,Any}("type","Point"))))
Dict{String,Any}(Pair{String,Any}("rev","1-816e850ff5ec2249993675fd568b2e9c"),Pair{String,Any}("id","79f14b64c57461584b152123e3927629"),Pair{String,Any}("geometry",Dict{String,Any}(Pair{String,Any}("coordinates",Any[-71.0587,42.3594]),Pair{String,Any}("type","Point"))))
Dict{String,Any}(Pair{String,Any}("rev","1-59e512ec186a17dc3e94a3f1d7c13392"),Pair{String,Any}("id","79f14b64c57461584b152123e392867d"),Pair{String,Any}("geometry",Dict{String,Any}(Pair{String,Any}("coordinates",Any[-71.0587,42.3594]),Pair{String,Any}("type","Point"))))
⋮
Dict{String,Any}(Pair{String,Any}("rev","1-be45124918034417ce77adbd99d3d54f"),Pair{String,Any}("id","79f14b64c57461584b152123e38c8ead"),Pair{String,Any}("geometry",Dict{String,Any}(Pair{String,Any}("coordinates",Any[-71.1331,42.3634]),Pair{String,Any}("type","Point"))))
Dict{String,Any}(Pair{String,Any}("rev","1-e17545f877d7fc1442abe71557ec44c8"),Pair{String,Any}("id","79f14b64c57461584b152123e391c876"),Pair{String,Any}("geometry",Dict{String,Any}(Pair{String,Any}("coordinates",Any[-71.1073,42.3038]),Pair{String,Any}("type","Point"))))
Dict{String,Any}(Pair{String,Any}("rev","1-50e1dd9b9ad194f90a0fb4f9001d1b43"),Pair{String,Any}("id","79f14b64c57461584b152123e3929889"),Pair{String,Any}("geometry",Dict{String,Any}(Pair{String,Any}("coordinates",Any[-71.0551,42.289]),Pair{String,Any}("type","Point"))))
Dict{String,Any}(Pair{String,Any}("rev","1-f8407a2467b8fea166aa451994de75da"),Pair{String,Any}("id","79f14b64c57461584b152123e38b682a"),Pair{String,Any}("geometry",Dict{String,Any}(Pair{String,Any}("coordinates",Any[-71.0773,42.2896]),Pair{String,Any}("type","Point"))))
Dict{String,Any}(Pair{String,Any}("rev","1-459aadf6156187de8c11ecce3b5f1f28"),Pair{String,Any}("id","79f14b64c57461584b152123e38afe98"),Pair{String,Any}("geometry",Dict{String,Any}(Pair{String,Any}("coordinates",Any[-71.0501,42.2897]),Pair{String,Any}("type","Point"))))
Dict{String,Any}(Pair{String,Any}("rev","1-1d1c012db58954c6d799646e0e009728"),Pair{String,Any}("id","79f14b64c57461584b152123e38b0d38"),Pair{String,Any}("geometry",Dict{String,Any}(Pair{String,Any}("coordinates",Any[-71.0473,42.2902]),Pair{String,Any}("type","Point"))))
Dict{String,Any}(Pair{String,Any}("rev","1-21dea1eb417bff225b4932acbe983314"),Pair{String,Any}("id","79f14b64c57461584b152123e38c9b44"),Pair{String,Any}("geometry",Dict{String,Any}(Pair{String,Any}("coordinates",Any[-71.1097,42.3042]),Pair{String,Any}("type","Point"))))
Dict{String,Any}(Pair{String,Any}("rev","1-edd6492692311118baaa8cbb980ef1c5"),Pair{String,Any}("id","79f14b64c57461584b152123e38d51e7"),Pair{String,Any}("geometry",Dict{String,Any}(Pair{String,Any}("coordinates",Any[-71.1341,42.349]),Pair{String,Any}("type","Point"))))
Dict{String,Any}(Pair{String,Any}("rev","1-13144e283f47d611d62d9f11d94161be"),Pair{String,Any}("id","79f14b64c57461584b152123e39168d7"),Pair{String,Any}("geometry",Dict{String,Any}(Pair{String,Any}("coordinates",Any[-71.135,42.3504]),Pair{String,Any}("type","Point"))))
We can specify a polygon for the Commercial Street corridor, which should yield only two docs:
result = geo_query(geodb, "geodd", "geoidx";
g="POLYGON ((-71.0537124 42.3681995 0,-71.054399 42.3675178 0,-71.0522962 42.3667409 0,-71.051631 42.3659324 0,-71.051631 42.3621431 0,-71.0502148 42.3618577 0,-71.0505152 42.3660275 0,-71.0511589 42.3670263 0,-71.0537124 42.3681995 0))")
result["rows"]
2-element Array{Any,1}:
Dict{String,Any}(Pair{String,Any}("rev","1-f0551b24741f182c5944621f87f9ac76"),Pair{String,Any}("id","79f14b64c57461584b152123e38d6349"),Pair{String,Any}("geometry",Dict{String,Any}(Pair{String,Any}("coordinates",Any[-71.0511,42.3651]),Pair{String,Any}("type","Point"))))
Dict{String,Any}(Pair{String,Any}("rev","1-8a9f1673b2b15232bbbb956a7f3b5397"),Pair{String,Any}("id","79f14b64c57461584b152123e3924516"),Pair{String,Any}("geometry",Dict{String,Any}(Pair{String,Any}("coordinates",Any[-71.052,42.3667]),Pair{String,Any}("type","Point"))))
If you want to delete a database, simply call deletedb:
deletedb(client, dbname)
Dict{String,Any} with 1 entry:
"ok" => true
Handling Cloudant's rate capping
Cloudant pushes most of its work upstream to Apache CouchDB. However, not everything Cloudant does makes sense for CouchDB, and one such example is throughput throttling. Cloudant, currently only in its Bluemix guise, prices its service in terms of provisioned throughput capacity for lookups, writes and queries. This means that you purchase a certain maximum number of requests per second, bucketed by type. This is similar in spirit to how other purveyors of database services price their services (e.g. DynamoDB).
When you hit capacity, Cloudant will return an error, signified by the HTTP status code 429 (Too many requests). This means that the request was not successful, and will need to be retried at a later stage. Couchzilla optionally gives you a way to deal with 429 errors:
retry_settings!(;enabled=true, max_retries=5, delay_ms=10)
This enables the retrying of requests that failed with a 429. A request will be tried a maximum of 5 times, with a delay of 10 ms added cumulatively, plus a little bit of noise (randomly between 1 and 10 ms). This is a module-global setting, so it will apply to all Clients created within the same Julia session.
You can retrieve the current settings using:
retry_settings()
Note that this behaviour is not enabled by default, and relying on it alone on a rate-capped cluster will only help with temporary transgressions – your own code must still handle the case where the max retries are exceeded.
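The retry behaviour can be pictured with a small local sketch. It assumes one reading of "added cumulatively" (the n-th wait is n times delay_ms, plus 1 to 10 ms of noise), which may differ from what Couchzilla actually does. The names with_retry and flaky are hypothetical, and the code targets a current Julia release:

```julia
# Hypothetical helper: call `request` until it stops returning 429
# or the retry budget is spent. Returns (status, attempts).
function with_retry(request; max_retries::Int = 5, delay_ms::Int = 10)
    attempts = 0
    while true
        attempts += 1
        status = request()
        if status != 429 || attempts >= max_retries
            return status, attempts
        end
        # Assumed schedule: linear growth plus 1-10 ms of random noise.
        sleep((attempts * delay_ms + rand(1:10)) / 1000)
    end
end

# Fake request that is rate-limited twice, then succeeds.
calls = Ref(0)
flaky() = (calls[] += 1; calls[] <= 2 ? 429 : 200)

status, attempts = with_retry(flaky)
println("status=", status, " after ", attempts, " attempts")

# A request that never recovers still surfaces the 429 to the caller.
status2, attempts2 = with_retry(() -> 429; max_retries = 3, delay_ms = 1)
println("status=", status2, " after ", attempts2, " attempts")
```

The second call shows why calling code still needs its own handling: once the budget is exhausted, the 429 comes back unchanged.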
Using Cloudant's API keys for auth
Cloudant has an auth system distinct from the traditional CouchDB style based on the _users database. By using API keys you can grant and revoke a client application's access. API keys have roles attached to them, a combination of _admin, _reader, _writer, _replicator and _creator. It's not quite as straightforward as it may seem. _reader grants read-only access. TODO
In order to use the API key system, you need four steps:
1. Create the key using
data = make_api_key(client::Client)
2. Assign the key to a database, with the appropriate roles
current = get_permissions(db)
result = set_permissions(db, current; key=data["key"], roles=["_reader", "_writer"])
3. Create a new client connection using the new key
api_client = Client(data["key"], data["password"], host)
4. Create a database connection using the new client
api_db = connectdb(api_client, "dbname")
There is one gotcha here that you need to be aware of. API keys are created on a central Cloudant admin cluster, and then replicated back to the one you're using. This means that running through the four steps above may occasionally fail to authenticate (step 3) for a good few minutes whilst the update percolates through. It helps to treat API keys as something to be created up front, rather than on the fly.
Client
Couchzilla.Client
— Type.
type Client
url
cookies
Client(username::AbstractString, password::AbstractString, urlstr::AbstractString; auth=true) =
cookieauth!(new(URI(urlstr)), username, password, auth)
end
The Client type represents an authenticated connection to a remote CouchDB/Cloudant instance.
Couchzilla.connectdb
— Method.
db = connectdb(client::Client, database::AbstractString)
Return an immutable Database reference.
Subsequent database-level operations will operate on the chosen database. If you need to operate on a different database, you need to create a new Database reference. connectdb(...) does not check that the chosen remote database exists.
Couchzilla.createdb
— Method.
db, created = createdb(client::Client, database::AbstractString)
Create a new database on the remote end, named by the database argument. Return an immutable Database reference to this newly created db, and a boolean which is true if a database was created, false if it already existed.
Couchzilla.dbinfo
— Method.
info = dbinfo(client::Client, name::AbstractString)
Return the metadata about the named database.
Couchzilla.listdbs
— Method.
dblist = listdbs(client::Client)
Return a list of all databases under the authenticated user.
Couchzilla.deletedb
— Method.
result = deletedb(client::Client, name::AbstractString)
Delete the named database.
Couchzilla.cookieauth!
— Function.
cookieauth!(client::Client, username::AbstractString, password::AbstractString, auth::Bool=true)
Private. Hits the _session endpoint to obtain a session cookie that is used to authenticate subsequent requests. If auth is set to false, this does nothing.
Database
The Database type represents a client connection tied to a specific database name. This is immutable, meaning that if you need to talk to several databases you need to create one Database instance for each.
Couchzilla.Database
— Type.
immutable Database
url
name
client
Database(client::Client, name::AbstractString) =
new(URI(client.url.scheme, client.url.host, client.url.port, "/$name"), name, client)
end
The Database immutable is a client connection tied to a specific remote DB. It is normally not created directly, but via a call to connectdb() or createdb().
Examples
# Connect to existing DB. Does not verify it exists.
db = connectdb(client, "mydb")
# Create a new db if it doesn't exist, otherwise connect
db, created = createdb(client, "mydb")
Couchzilla.bulkdocs
— Method.
result = bulkdocs(db::Database; data=[], options=Dict())
Raw _bulk_docs.
This is a function primarily intended for internal use, but can be used directly to create, update or delete documents in bulk, so as to save on the HTTP overhead.
Couchzilla.createdoc
— Method.
result = createdoc(db::Database, body::Dict)
Create one new document.
Note that this is implemented via the _bulk_docs endpoint, rather than a POST to /{DB}.
Couchzilla.createdoc
— Method.
result = createdoc(db::Database, data::AbstractArray)
Bulk create a set of new documents via the CouchDB _bulk_docs endpoint.
Couchzilla.readdoc
— Method.
result = readdoc(db::Database, id::AbstractString;
rev = "",
attachments = false,
att_encoding_info = false,
atts_since = [],
open_revs = [],
conflicts = false,
deleted_conflicts = false,
latest = false,
meta = false,
revs = false,
revs_info = false)
Fetch a document by id.
For a description of the parameters, see reference below. To use the open_revs parameter as all, use
result = readdoc(db, id; open_revs=["all"])
Couchzilla.updatedoc
— Method.
result = updatedoc(db::Database; id::AbstractString=nothing, rev::AbstractString=nothing, body=Dict())
Update an existing document, creating a new revision.
Implemented via the _bulk_docs endpoint.
Couchzilla.deletedoc
— Method.
result = deletedoc(db::Database; id::AbstractString=nothing, rev::AbstractString=nothing)
Delete a document revision. Implemented via the _bulk_docs endpoint.
Views
Couchzilla.view_index
— Method.
result = view_index(db::Database, ddoc::AbstractString, name::AbstractString, map::AbstractString;
reduce::AbstractString = "")
Create a secondary index.
The map is a string containing a map function in Javascript. Currently, view_index can only create a single view per design document.
The optional reduce parameter is a string containing either a custom Javascript reducer (best avoided for performance reasons) or the name of a built-in Erlang reducer, e.g. "_stats".
Examples
result = view_index(db, "my_ddoc", "my_view", "function(doc){if(doc&&doc.name){emit(doc.name,1);}}")
Returns
Returns a Dict(...) from the CouchDB response, of the form
Dict(
"ok" => true,
"rev" => "1-b950984b19bb1b8bb43513c9d5b235bc",
"id" => "_design/my_ddoc"
)
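To make the map step concrete, here is a rough local emulation of what such a view computes, written for a current Julia release. It is only a sketch of the idea (emit one row per document that has a name, keep the rows sorted by key, then look keys up). It is not how CouchDB builds or stores its indexes. The names map_name and lookup and the short document ids are hypothetical:

```julia
# Julia counterpart of the Javascript map function in the example above:
# emit (key, value) = (doc.name, 1) for documents that have a name.
map_name(doc) = haskey(doc, "name") ? [(doc["name"], 1)] : []

docs = [
    Dict("_id" => "a1", "name" => "adam"),
    Dict("_id" => "b2", "name" => "billy"),
    Dict("_id" => "c3"),                      # no name: emits nothing
    Dict("_id" => "d4", "name" => "davina"),
]

# Build the "index": one row per emit, sorted by key.
rows = [Dict("id" => doc["_id"], "key" => k, "value" => v)
        for doc in docs for (k, v) in map_name(doc)]
sort!(rows; by = row -> row["key"])

# Look up a set of keys, mirroring the keys=[...] option of view_query.
lookup(rows, keys) = [row for k in keys for row in rows if row["key"] == k]

result = lookup(rows, ["davina", "billy"])
println([row["id"] for row in result])
```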
Couchzilla.view_query
— Function.
result = view_query(db::Database, ddoc::AbstractString, name::AbstractString;
descending = false,
endkey = "",
include_docs = false,
conflicts = false,
inclusive_end = true,
group = false,
group_level = 0,
reduce = true,
key = "",
keys = [],
limit = 0,
skip = 0,
startkey = "")
Query a secondary index.
Examples
# Query the view for a known key subset
result = view_query(db, "my_ddoc", "my_view"; keys=["adam", "billy"])
Returns
Dict(
"rows" => [
Dict("key" => "adam", "id" => "591c02fa8b8ff14dd4c0553670cc059a", "value" => 1),
Dict("key" => "billy", "id" => "591c02fa8b8ff14dd4c0553670cc13c1", "value" => 1)
],
"offset" => 0,
"total_rows" => 7
)
Couchzilla.alldocs
— Function.
alldocs(db::Database;
descending = false,
endkey = "",
include_docs = false,
conflicts = false,
inclusive_end = true,
key = "",
keys = [],
limit = 0,
skip = 0,
startkey = "")
Return all documents in the database by the primary index.
The optional parameters are:
- descending true/false – lexicographical ordering of keys. Default false.
- endkey id – stop when endkey is reached. Optional.
- startkey id – start at startkey. Optional.
- include_docs true/false – return the document body. Default false.
- conflicts true/false – also return any conflicting revisions. Default false.
- inclusive_end true/false – if endkey is given, should this be included? Default true.
- key id – return only specific key. Optional.
- keys [id, id,...] – return only specific set of keys (will POST). Optional.
- limit int – return only max limit number of rows. Optional.
- skip int – skip over the first skip number of rows. Default 0.
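As a rough illustration of how the range parameters interact, the following sketch applies startkey, endkey, inclusive_end, skip and limit to a list of ids held in memory. It assumes ascending order and plain string comparison (CouchDB itself applies its own collation rules), it targets a current Julia release, and range_ids is a hypothetical name. Nothing here talks to a server:

```julia
# Hypothetical local model of the primary-index range options (ascending only).
function range_ids(ids::Vector{String};
                   startkey = nothing, endkey = nothing,
                   inclusive_end::Bool = true, skip::Int = 0, limit = nothing)
    selected = sort(ids)
    if startkey !== nothing
        selected = filter(id -> id >= startkey, selected)
    end
    if endkey !== nothing
        selected = inclusive_end ? filter(id -> id <= endkey, selected) :
                                   filter(id -> id <  endkey, selected)
    end
    selected = selected[min(skip, length(selected)) + 1:end]   # drop the first `skip` rows
    if limit !== nothing
        selected = selected[1:min(limit, length(selected))]    # cap the row count
    end
    return selected
end

ids = ["a", "b", "c", "d", "e"]
println(range_ids(ids; startkey = "b", endkey = "d"))
println(range_ids(ids; startkey = "b", endkey = "d", inclusive_end = false))
println(range_ids(ids; skip = 1, limit = 2))
```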
Mango/Cloudant Query
Couchzilla.Selector
— Type.
immutable Selector
dict::Dict{AbstractString, Any}
end
Immutable representation of a Mango Selector used to query a Mango index.
Usually created using the custom string literal q"..." (see the @q_str macro), but can be created directly from either the raw json string containing a Selector expression or a Julia Dict(...) representing the same.
Examples
sel = q"name = bob"
sel = Selector("{\"name\":{\"\$eq\":\"bob\"}}")
sel = Selector(Dict("name" => Dict("\$eq" => "bob")))
sel = and([q"name = bob", q"age > 18"])
Couchzilla.Selector
— Method.
Selector()
The empty Selector.
Couchzilla.Selector
— Method.
Selector(raw_json::AbstractString)
Create a Selector from the raw json.
Base.isempty
— Function.
isempty(sel::Selector)
True if sel is the empty Selector.
Couchzilla.@q_str
— Macro.
q"....."
Custom string literal for a limited Selector definition DSL.
It takes the form:
field op data
where field is a field name, and op is one of
=, !=, <, <=, >, >=, in, !in, all
This allows you to write things like:
q"name = bob"
q"value < 5"
q"occupation in [fishmonger, pilot, welder]"
Note that the Selector DSL only covers a fraction of the full Selector syntax. It can be used with the boolean functions and(), or() etc. to build up more complex Selectors, e.g.
sel = and([q"name = bob", q"age > 18"])
For more information on the actual Selector syntax, see link below.
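To give a feel for what the string literal produces, here is a simplified, hypothetical re-implementation of the comparison part of the DSL for a current Julia release. It is not Couchzilla's parser: it handles only =, !=, <, <=, >, >= (not in, !in or all), and the names MANGO_OPS and simple_selector are made up for this sketch. The operator names ($eq, $gt and so on) are the standard Mango ones:

```julia
# DSL operator symbol => Mango operator name.
const MANGO_OPS = Dict("=" => "\$eq", "!=" => "\$ne", "<" => "\$lt",
                       "<=" => "\$lte", ">" => "\$gt", ">=" => "\$gte")

# Hypothetical parser for "field op data". Numbers are parsed as Int or
# Float64; anything else is kept as a string.
function simple_selector(expr::AbstractString)
    # Two-character operators are listed first so "<=" is not read as "<".
    m = match(r"^\s*(\S+?)\s*(!=|<=|>=|=|<|>)\s*(.*?)\s*$", expr)
    m === nothing && throw(ArgumentError("cannot parse: $expr"))
    field, op, raw = String(m.captures[1]), String(m.captures[2]), String(m.captures[3])
    value = something(tryparse(Int, raw), tryparse(Float64, raw), raw)
    return Dict{String,Any}(field => Dict{String,Any}(MANGO_OPS[op] => value))
end

println(simple_selector("name = bob"))
println(simple_selector("value < 5"))
println(simple_selector("age>=18"))
```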
Couchzilla.QueryResult
— Type.
type QueryResult
docs::Vector{Dict{AbstractString, Any}}
bookmark::AbstractString
end
Returned by mango_query(...).
bookmark is only relevant when querying indexes of type text.
Couchzilla.mango_query
— Function.
result = mango_query{T<:AbstractString}(db::Database, selector::Selector;
fields::Vector{T} = Vector{AbstractString}(),
sort::Vector{Dict{T, Any}} = Vector{Dict{AbstractString, Any}}(),
limit = 0,
skip = 0,
bookmark = "")
Query database (Mango/Cloudant Query).
See the Selector type and the associated q"..." custom string literal which implements a simplified DSL for writing selectors.
Examples
Find all documents where "year" is greater than 2010, returning the fields _id, _rev, year and title, sorted in ascending order on year. Set the page size to 10.
result = mango_query(db, q"year > 2010";
fields = ["_id", "_rev", "year", "title"],
sort = [Dict("year" => "asc")],
limit = 10)
Returns
type QueryResult
Couchzilla.paged_mango_query
— Function.
paged_mango_query{T<:AbstractString}(db::Database, selector::Selector;
fields::Vector{T} = Vector{AbstractString}(),
sort::Vector{Dict{T, Any}} = Vector{Dict{AbstractString, Any}}(),
pagesize = 100)
Perform multiple HTTP requests against a JSON-type index, producing the intermediate results one page at a time. This is a wrapper around mango_query() using the skip and limit parameters.
Examples
for page in @task paged_mango_query(db, q"data = ..."; pagesize=10)
for doc in page.docs
# ...
end
end
Couchzilla.mango_index
— Function.
result = mango_index{T<:AbstractString}(db::Database, fields::AbstractArray;
name::T = "",
ddoc::T = "",
selector = Selector(),
default_field = Dict{String, Any}("analyzer" => "standard", "enabled" => true))
Create a Mango index.
All keyword parameters are optional. The fields spec is mandatory for JSON-type indexes. For a text index, if you give an empty vector as the fields, it will index every field, which is occasionally convenient, but a significant performance drain. The index type defaults to "json", and will be assumed to be "text" if the data in the fields array are Dicts. Note that the text index type is a Cloudant-only feature.
Examples
- Make a text index (Cloudant only)
result = mango_index(db, [Dict("name"=>"lastname", "type"=>"string")]; ddoc="my-ddoc", default_field=Dict("analyzer" => "german", "enabled" => true))
- Make a json index
result = mango_index(db, ["data", "data2"])
Returns
mango_index() returns a Dict(...) version of the CouchDB response, of the form
Dict(
"name" => "e7d18f69aa0deaa1ffcdf8f705895b61515a6bf6",
"id" => "_design/e7d18f69aa0deaa1ffcdf8f705895b61515a6bf6",
"result" => "created"
)
Couchzilla.listindexes
— Method.
result = listindexes(db::Database)
List all existing indexes for the database. This includes views, mango and geo indexes in addition to the primary index.
Returns
listindexes() returns a Dict(...) version of the CouchDB response:
Dict(
"indexes" => [
Dict(
"name" => "_all_docs",
"def" => Dict(
"fields" => [Dict("_id" => "asc")]
),
"ddoc" => nothing,
"type" => "special"
),
Dict(
"ddoc" => "_design/cc79a71f562af7ef36deafe511fea9a857b05bcc",
"name" => "cc79a71f562af7ef36deafe511fea9a857b05bcc",
"type" => "text",
"def" => Dict(
"index_array_lengths" => true,
"fields" => [Dict("cust" => "string"), Dict("value" => "string")],
"default_field" => Dict(
"analyzer" => "standard",
"enabled" => true
),
"selector" => Dict(),
"default_analyzer" => "keyword"
)
),
# ...
]
)
Couchzilla.mango_deleteindex
— Method.
result = mango_deleteindex(db::Database; ddoc="", name="", indextype="")
Delete a query index given its ddoc, index name and index type.
Indextype is either "text" or "json".
Returns
mango_deleteindex() returns a Dict(...) version of the CouchDB response:
Dict("ok" => true)
Attachments
You can attach files to documents in CouchDB. This can occasionally be convenient, but using attachments has performance implications, especially when combined with replication. See Cloudant's docs on the subject.
Couchzilla.put_attachment
— Method.
put_attachment(db::Database,
id::AbstractString,
rev::AbstractString,
name::AbstractString,
mimetype::AbstractString,
file::AbstractString)
Write an attachment to an existing document. The attachment is read from a file.
Examples
doc = createdoc(db, Dict("item" => "screenshot"))
result = put_attachment(db, doc["id"], doc["rev"], "test.png", "image/png", "data/test.png")
Couchzilla.get_attachment
— Method.
result = get_attachment(db::Database, id::AbstractString, name::AbstractString; rev::AbstractString = "")
Read an attachment.
Examples
att = get_attachment(db, id, "test.png"; rev=rev)
open("data/fetched.png", "w") do f
write(f, att)
end
Couchzilla.delete_attachment
— Method.
result = delete_attachment(db::Database, id::AbstractString, rev::AbstractString, name::AbstractString)
Delete an attachment.
Examples
result = delete_attachment(db, id, rev, "test.png")
Replication
Unlike e.g. PouchDB, CDTDatastore and sync-android, Couchzilla is not a replication library in that it does not implement a local data store. However, you have access to all replication-related endpoints provided by CouchDB. The CouchDB replication algorithm is largely undocumented, but a good write-up can be found in Couchbase's repo.
Couchzilla.changes
— Function.
changes(db::Database;
doc_ids = [],
conflicts = false,
descending = false,
include_docs = false,
attachments = false,
att_encoding_info = false,
limit = 0,
since = 0)
Query the CouchDB changes feed, returned as a big Dict. Normal (batch) mode only; for streaming, see changes_streaming().
Note that the CouchDB parameter last-event-id is not supported. Use since to achieve the same thing.
Examples
results = changes(db; include_docs=true, since=0)
filtered = changes(db; doc_ids=["25806e48920b4a35b3c9d9f23c16c821", "644464774951c32fad7243ac8c9745ad"])
#
Couchzilla.changes_streaming
— Function.
changes_streaming(db::Database;
doc_ids = [],
conflicts = false,
descending = false,
include_docs = false,
attachments = false,
att_encoding_info = false,
limit = 0,
since = 0)
Query the CouchDB changes feed, line by line. This is a co-routine. Note that the last item produced will always be the CouchDB last_seq entry.
Note that the CouchDB parameter last-event-id is not supported. Use since to achieve the same thing.
Examples
for ch in @task changes_streaming(db, limit=1)
println(ch)
end
Dict(
"seq"=>"1-g1...gm-",
"changes"=>[Dict("rev"=>"1-24213171b98945a2ed3578c926eb3651")],
"id"=>"37f11227ef384458b01e4afc7eed7194"
)
Dict(
"pending"=>213,
"last_seq"=>"1-g1...gm-"
)
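Because the final item is always the last_seq entry, a consumer usually wants to treat it differently from the change rows. The following is a minimal sketch of that separation, written for current Julia syntax and run against a hand-built list shaped like the output above rather than a live feed. The helper name split_feed is hypothetical and not part of Couchzilla.

```julia
# Stand-in for items produced by the feed, copied from the example output above.
items = [
    Dict(
        "seq" => "1-g1...gm-",
        "changes" => [Dict("rev" => "1-24213171b98945a2ed3578c926eb3651")],
        "id" => "37f11227ef384458b01e4afc7eed7194"
    ),
    Dict("pending" => 213, "last_seq" => "1-g1...gm-")
]

# Hypothetical helper (not part of Couchzilla): separate the change rows
# from the trailing last_seq entry.
function split_feed(items)
    rows = filter(item -> !haskey(item, "last_seq"), items)
    idx = findlast(item -> haskey(item, "last_seq"), items)
    last_seq = idx === nothing ? nothing : items[idx]["last_seq"]
    return rows, last_seq
end

rows, last_seq = split_feed(items)
println(length(rows), " change(s), resume from ", last_seq)
```

The last_seq value is what you would pass as since on a later call to pick up where this one stopped.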
#
Couchzilla.revs_diff
— Function.
revs_diff{T<:AbstractString}(db::Database; data::Dict{T, Vector{T}} = Dict())
revs_diff is a component of the CouchDB replication algorithm.
Given a set of ids and revs, it will return a potentially empty subset of ids and revs from this list which the remote end doesn't have.
Dict(
"190f721ca3411be7aa9477db5f948bbb" => [
"3-bb72a7682290f94a985f7afac8b27137",
"4-10265e5a26d807a3cfa459cf1a82ef2e",
"5-067a00dff5e02add41819138abb3284d"
]
)
Returns
The returned structure is a Dict where the keys are the ids of any documents where missing revs are found. An example:
Dict(
"e1132d11a43933948cb46c5e72e13659" => Dict(
"missing" => ["2-1f0e2f0d841ba6b7e3d735b870ebeb8c"],
"possible_ancestors" => ["1-efda16b0115e5fcf2cfd065faee674fc"]
)
)
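A replicator typically turns this structure into a flat list of (id, rev) pairs to fetch. The sketch below shows one way to do that in current Julia syntax, using the example response above as input. The helper missing_revs is hypothetical and not part of Couchzilla, and it assumes every value carries its revs under the "missing" key as in the example.

```julia
# Sample response in the shape shown above.
response = Dict(
    "e1132d11a43933948cb46c5e72e13659" => Dict(
        "missing" => ["2-1f0e2f0d841ba6b7e3d735b870ebeb8c"],
        "possible_ancestors" => ["1-efda16b0115e5fcf2cfd065faee674fc"]
    )
)

# Hypothetical helper (not part of Couchzilla): flatten the response into
# (id, rev) pairs for the revisions reported as missing.
function missing_revs(response::AbstractDict)
    out = Tuple{String,String}[]
    for (id, info) in response
        for rev in get(info, "missing", String[])
            push!(out, (id, rev))
        end
    end
    return out
end

println(missing_revs(response))
```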
#
Couchzilla.bulk_get
— Function.
bulk_get{T<:AbstractString}(db::Database; data::Vector{Dict{T, T}} = [])
bulk_get is used as part of an optimisation of the CouchDB replication algorithm in recent versions, allowing the replicator to request many documents with full ancestral information in a single HTTP request.
It is supported in CouchDB >= 2.0 (Cloudant "DBNext"), and also supported by PouchDB.
The data parameter is a list of Dicts with keys id and rev.
Examples
result = bulk_get(db; data = [
Dict(
"id" => "f6b40e2fdc017e7e4ec4fa88ae3a4950",
"rev" => "2-1f0e2f0d841ba6b7e3d735b870ebeb8c"
),
Dict(
"id" => "2f8b7921cbcfde79fb2ff8079cada273",
"rev" => "1-6c3ef2ba29b6631a01ce00f80b5b4ad3"
)
])
Returns
The response format is convoluted, and seemingly undocumented for both CouchDB and Cloudant at the time of writing.
{
"results": [
{
"id": "1c43dd76fee5036c0cb360648301a710",
"docs": [
{
"ok": { ..doc body here... }
}
]
}
]
}
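Given the nesting, it can help to unwrap the response before using it. The following is a rough sketch in current Julia syntax that collects the document bodies found under "ok". It runs against a hand-built Dict shaped like the response above, with a placeholder body. The helper ok_docs is hypothetical and not part of Couchzilla, and it simply skips any entry without an "ok" key, whatever that entry contains.

```julia
# Hand-built stand-in for a parsed response; the document body is a placeholder.
response = Dict(
    "results" => [
        Dict(
            "id" => "1c43dd76fee5036c0cb360648301a710",
            "docs" => [
                Dict("ok" => Dict(
                    "_id" => "1c43dd76fee5036c0cb360648301a710",
                    "item" => "placeholder body"
                ))
            ]
        )
    ]
)

# Hypothetical helper (not part of Couchzilla): collect every body stored under "ok".
function ok_docs(response::AbstractDict)
    docs = []
    for result in get(response, "results", [])
        for entry in get(result, "docs", [])
            # Entries without an "ok" key are ignored in this sketch.
            haskey(entry, "ok") && push!(docs, entry["ok"])
        end
    end
    return docs
end

docs = ok_docs(response)
println(length(docs), " document(s) returned")
```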
Geospatial
#
Couchzilla.geo_index
— Function.
result = geo_index(db::Database, ddoc::AbstractString, name::AbstractString, index::AbstractString)
Create a geospatial index.
The index parameter is a string containing an index function in JavaScript.
Examples
result = geo_index(db, "geodd", "geoidx",
"function(doc){if(doc.geometry&&doc.geometry.coordinates){st_index(doc.geometry);}}"
)
Returns
Returns a Dict(...) from the CouchDB response, of the type
Dict(
"ok" => true,
"rev" => "1-b950984b19bb1b8bb43513c9d5b235bc",
"id" => "_design/geodd"
)
#
Couchzilla.geo_indexinfo
— Function.
result = geo_indexinfo(db::Database, ddoc::AbstractString, name::AbstractString)
Retrieve stats for a geospatial index.
Examples
result = geo_indexinfo(db, "geodd", "geoidx")
Returns
Returns a Dict(...) from the CouchDB response, of the type
Dict(
"name" => "_design/geodd/geoidx",
"geo_index" => Dict(
"doc_count" => 269,
"disk_size" => 33416,
"data_size" => 26974
)
)
#
Couchzilla.geo_query
— Function.
geo_query(db::Database, ddoc::AbstractString, name::AbstractString;
lat::Float64 = -360.0,
lon::Float64 = -360.0,
rangex::Float64 = 0.0,
rangey::Float64 = 0.0,
radius::Float64 = 0.0,
bbox::Vector{Float64} = Vector{Float64}(),
relation::AbstractString = "intersects",
nearest = false,
bookmark::AbstractString = "",
format::AbstractString = "view",
skip = 0,
limit = 0,
stale = false,
g::AbstractString = "")
Query a geospatial index. This quickly becomes complicated. See the references below.
The "g" parameter is a string representing a Well Known Text (WKT) object. It can be used to describe various geometries, such as lines and polygons. Currently supported geometric objects are
- point
- linestring
- polygon
- multipoint
- multilinestring
- multipolygon
- geometrycollection
Geo queries can be configured to return their results in a number of different formats using the format parameter. The accepted values are:
- legacy
- geojson
- view (default)
- application/vnd.geo+json
The relation parameter follows the DE-9IM spec for geometric relationships. Acceptable values are:
- contains
- contains_properly
- covered_by
- covers
- crosses
- disjoint
- intersects (default)
- overlaps
- touches
- within
Examples
Radial query
result = geo_query(geodb, "geodd", "geoidx";
lat = 42.357963,
lon = -71.063991,
radius = 10000.0,
limit = 200)
Polygon query
result = geo_query(geodb, "geodd", "geoidx";
g="POLYGON ((-71.0537124 42.3681995 0,-71.054399 42.3675178 0,-71.0522962 42.3667409 0,-71.051631 42.3659324 0,-71.051631 42.3621431 0,-71.0502148 42.3618577 0,-71.0505152 42.3660275 0,-71.0511589 42.3670263 0,-71.0537124 42.3681995 0))")
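Writing WKT strings by hand is error-prone, so it can be convenient to generate the g value from coordinates. The following is a minimal sketch in current Julia syntax. The helper wkt_polygon is hypothetical and not part of Couchzilla. It assumes two-dimensional (lon, lat) pairs, whereas the example above carries a third coordinate of 0, so adjust it if your index expects that form.

```julia
# Hypothetical helper (not part of Couchzilla): build a WKT POLYGON string
# from (lon, lat) pairs. Assumes 2D coordinates.
function wkt_polygon(points::Vector{Tuple{Float64,Float64}})
    ring = copy(points)
    # A WKT polygon ring has to end on the point it started from.
    if first(ring) != last(ring)
        push!(ring, first(ring))
    end
    coords = join(["$(lon) $(lat)" for (lon, lat) in ring], ",")
    return "POLYGON (($coords))"
end

g = wkt_polygon([
    (-71.0537124, 42.3681995),
    (-71.054399, 42.3675178),
    (-71.0522962, 42.3667409)
])
println(g)
```

The resulting string would then be passed as the g keyword argument to geo_query.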
Auth
#
Couchzilla.get_permissions
— Function.
data = get_permissions(db::Database)
Fetch all current permissions. Note: this is Cloudant-specific.
#
Couchzilla.set_permissions
— Function.
result = set_permissions(db::Database, current::Dict=Dict{AbstractString, Any}(); key="", roles=[])
Modify permissions. Note: this is Cloudant-specific.
#
Couchzilla.make_api_key
— Function.
data = make_api_key(client::Client)
Generate a new API key. Note: this is Cloudant-specific.
Note also that API keys take a long time to propagate around a cluster. It's unsafe to rely on a newly created key being immediately available. The reason for this is that Cloudant keeps its auth-related documents centrally, and replicates them out to all clusters.
#
Couchzilla.delete_api_key
— Function.
result = delete_api_key(db::Database, key::AbstractString)
Remove an existing API key. Note: this is Cloudant-specific. This is implemented via set_permissions().
Utility stuff
#
Couchzilla.retry_settings!
— Method.
retry_settings!(;enabled=false, max_retries=5, delay_ms=10)
Set parameters for retrying requests failed with a 429: Too Many Requests. This is Cloudant-specific, but safe to leave enabled if using CouchDB, as the error will never be encountered.
Failed requests are retried after a growing interval according to
sleep((tries * delay_ms + rand(1:10))/1000.0)
until tries exceeds max_retries or the request succeeds.
Note: it is not sufficient to rely on this behaviour on a rate-limited Cloudant cluster, as persistently hitting the limits can only be fixed by moving to higher reserved throughput capacity. For this reason this is disabled by default.
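To get a feel for the numbers, the formula above can be evaluated on its own. The sketch below, in current Julia syntax, computes the delay before each retry with the default delay_ms of 10. The helper retry_delay is hypothetical and not part of Couchzilla, and its jitter keyword stands in for rand(1:10) so that the schedule is reproducible.

```julia
# Hypothetical helper mirroring the documented formula; not part of Couchzilla.
# `jitter` replaces rand(1:10) so the result can be pinned down.
retry_delay(tries::Integer, delay_ms::Integer; jitter::Integer = rand(1:10)) =
    (tries * delay_ms + jitter) / 1000.0

# Delay in seconds before each of the first five retries, with jitter fixed at 5 ms.
schedule = [retry_delay(t, 10; jitter = 5) for t in 1:5]
println(schedule)
```

The interval grows linearly with the attempt number, so with the defaults the total time spent waiting stays well under a second.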
#
Couchzilla.retry_settings
— Function.
retry_settings()
Return the current retry settings.
#
Couchzilla.relax
— Function.
relax(fun, url_string; cookies=nothing, query=Dict(), headers=Dict())
Makes an HTTP request with the relevant cookies and query strings and deserialises the response, assumed to be json.
Cloudant implements request throttling based on reserved throughput capacity. Hitting a capacity limit will return a 429 error (Too many requests). This is Cloudant-specific.
This function can retry on 429 if this behaviour is enabled. See retry_settings().
#
Couchzilla.endpoint
— Function.
endpoint(uri::URI, path::AbstractString)
Appends a path string to the URI, returning the result as a string.