redis-tagging

Fast and simple tagging of (sorted) items

  • Maintains the order of tagged items with the help of Redis Sorted Sets.
  • Unions and intersections on tags while also maintaining the order.
  • Fast and efficient paging over thousands of results with support of limit, offset.
  • Namespaces to keep multiple "buckets" of tags on the same server.
  • Counters for each tag in a namespace.
  • REST interface via REST Tagging
  • Test coverage

Tagging and efficient querying of items with unions and intersections is no fun with traditional databases.

Imagine a SQL database with concerts that need to be output ordered by date. Each item is tagged with tags like chicago, rock, stadium, open-air. Now let's try to get the following items:

  • 10 concerts (ordered by date) in chicago (limit=10, tags=["chicago"]) and the total amount of concerts in chicago.
  • The next 10 concerts, skipping the first 10, (limit=10, tags=["chicago"], offset=10) and the total amount.
  • 40 concerts in detroit, chicago or cleveland (limit=40, tags=["detroit", "chicago", "cleveland"], type="union") and the total amount.
  • 50 concerts that are rock and in a stadium (limit=50, tags=["rock", "stadium"]) and the total amount.
  • The top 20 tags used and the amount of items tagged with each.
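Using the parameter names that this README documents for the tags and toptags methods (bucket, tags, limit, offset, type, amount), the five queries above might be written as the following parameter objects. This is only a sketch: the bucket name "concerts" and the variable name queries are illustrative choices.

```javascript
// Parameter objects for the five example queries.
// Assumption: all concerts live in one bucket named "concerts".
var queries = {
    // 10 concerts in chicago, ordered by score (the date)
    chicagoFirstPage: { bucket: "concerts", tags: ["chicago"], limit: 10 },
    // the next 10 concerts, skipping the first 10
    chicagoSecondPage: { bucket: "concerts", tags: ["chicago"], limit: 10, offset: 10 },
    // 40 concerts in detroit, chicago or cleveland
    threeCities: {
        bucket: "concerts",
        tags: ["detroit", "chicago", "cleveland"],
        limit: 40,
        type: "union"
    },
    // 50 concerts that are rock AND in a stadium ("inter" is the default type)
    rockStadium: { bucket: "concerts", tags: ["rock", "stadium"], limit: 50 },
    // the top 20 tags with their item counts (for toptags, not tags)
    top20Tags: { bucket: "concerts", amount: 20 }
};
```

The first four objects would be passed to rt.tags and the last one to rt.toptags. Each rt.tags response also carries total_items, which covers the "total amount" part of the queries.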

Those queries, together with the maintenance of tables and indexes, can be a pain with SQL. Enter Redis and its fast in-memory set operations.

Here is how Redis-Tagging will make the tagging of items in external databases fast and easy:

  • When storing an item in your database you still store the tags but in a normal string field for reference and easy output. You will no longer use this field in a WHERE-statement. No additional tables for tags and tag associations are needed.
  • You post the id, a score (for sorting) and the list of tags to Redis-Tagging whenever you add, update or delete an item. The score could be a date timestamp or any other number you use for sorting.
  • Redis-Tagging will output all results (e.g. all items with tags chicago and rock) as a list of IDs ordered correctly by the score you supplied.
  • You use this list of IDs to get the actual items from your database.

So with only small changes you end up with a lot less code and fewer tables, and you no longer need to maintain a complex structure just to support fast tagging.

If you want to use the REST interface to access Redis-Tagging from a non-Node.js application, please have a look at REST Tagging.

npm i redis-tagging --save

Parameters for RedisTagging:

  • host (String): optional (Default: "127.0.0.1") The Redis server
  • port (Number): optional (Default: 6379) The Redis port
  • options (Object): optional (Default: {}) Additional options. See: https://github.com/mranney/node_redis#rediscreateclientport-host-options
  • nsprefix (String): optional (Default: "rt") The namespace prefix used for all keys created by Redis Tagging
  • client: optional An external RedisClient object which will be used for the connection.
var RedisTagging = require("redis-tagging");
var rt = new RedisTagging({host: "127.0.0.1", port: 6379, nsprefix: "rt"} );

Important: Redis-Tagging works with items from your database (whatever you might use). Its purpose is to make tag based lookups fast and easy.
A typical item in your database should include an id (the primary key) and a list of tags for this item. You could store the tags as a JSON string (e.g. ["car", "bmw", "suv", "x5"]).
Keep your database in sync with the item IDs stored in Redis-Tagging.
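A minimal sketch of storing the tags as a JSON string, as suggested above. The function names tagsToColumn and columnToTags are hypothetical helpers, not part of Redis-Tagging.

```javascript
// Convert between the tag array used by Redis-Tagging and a string column.
// Assumption: the database column holds a JSON array of strings.

function tagsToColumn(tags) {
    return JSON.stringify(tags);
}

function columnToTags(value) {
    // Treat an empty or missing column as "no tags".
    if (!value) {
        return [];
    }
    var parsed = JSON.parse(value);
    return Array.isArray(parsed) ? parsed : [];
}

var stored = tagsToColumn(["car", "bmw", "suv", "x5"]);
var restored = columnToTags(stored);
```

The array in restored is what you would pass as tags when writing the item to Redis-Tagging.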

Go through the following examples to see what Redis-Tagging can do for you:

This will create an item with the id itm123.
Note: There is no partial update of tags for an item. You always write the full list of tags.

rt.set(
    {
        bucket: "concerts",
        id: "itm123",
        tags: ["new york", "stadium", "rock", "open-air"],
        score: 1356341337
    },
    function (err, resp) {
        if (resp === true) {
            // item was saved
        }
    }
);
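Because there is no partial update, adding a single tag means reading the current tags and writing the full list back. The sketch below shows one way this could look, using the get method described in this README. addTag is a hypothetical helper, and it assumes that set stores whatever score it is given, so the score is passed again.

```javascript
// Add one tag to an item by reading the current tags and writing the full list.
// `rt` is a RedisTagging instance (or any object with compatible get/set methods).
// Note: get followed by set is not atomic, so concurrent writers
// may overwrite each other's changes.
function addTag(rt, bucket, id, score, tag, callback) {
    rt.get({ bucket: bucket, id: id }, function (err, tags) {
        if (err) {
            return callback(err);
        }
        if (tags.indexOf(tag) !== -1) {
            // nothing to do, the tag is already present
            return callback(null, tags);
        }
        var newTags = tags.concat([tag]);
        rt.set(
            { bucket: bucket, id: id, tags: newTags, score: score },
            function (setErr) {
                if (setErr) {
                    return callback(setErr);
                }
                callback(null, newTags);
            }
        );
    });
}
```

If your own database already holds the tags for each item, it is usually simpler to update them there first and then call set with the complete list.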

Returns all tags for an item id.

Note: This method is usually not needed if you store the tags for each item in your database.

rt.get(
    {
        bucket: "concerts",
        id: "itm123"
    },
    function (err, resp) {
        // resp contains an array of all tags
        // For the above set example resp will contain:
        // ["new york", "stadium", "rock", "open-air"]
    }
);

Removes an item and all of its tags.

Note: This is the same as using set with an empty array of tags.

rt.remove(
    {
        bucket: "concerts",
        id: "itm123"
    },
    function (err, resp) {
        if (resp === true) {
            // item was removed
        }
    }
);

Returns all IDs in a bucket.

rt.allids(
    {
        bucket: "concerts"
    },
    function (err, resp) {
        // resp contains an array of all ids
    }
);

The main method. Returns the IDs for one or more tags. When more than one tag is supplied, the query can be an intersection (default) or a union. With type=inter (default), only those IDs are returned where all tags match. With type=union, all IDs where any tag matches are returned.

Parameters object:

  • bucket (String)
  • tags (Array) One or more tags
  • limit (Number) optional Default=100 (0 will return 0 items but will return the total_items!)
  • offset (Number) optional Default=0
  • withscores (Number) optional Default=0 Set this to 1 to output the scores
  • order (String) optional Default="desc"
  • type (String) optional Default="inter" Either "inter" or "union"

rt.tags(
    {
        bucket: "concerts",
        tags: ["berlin", "rock"],
        limit: 2,
        offset: 4
    },
    function (err, resp) {
        // resp contains:
        // {"total_items":108,
        //  "items":["8167","25652"],
        //  "limit":2,
        //  "offset":4}
    }
);

The returned data contains items no. 5 and 6. The first 4 were skipped (offset=4). You can now run a query like:

SELECT * FROM Concerts WHERE ID IN (8167,25652) ORDER BY Timestamp DESC
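The paging arithmetic and the final ordering step can be sketched as follows. All three function names are hypothetical helpers. orderRows is an alternative to the ORDER BY clause for cases where you prefer to keep the exact order returned by Redis-Tagging, and it assumes each database row exposes its primary key as an id property.

```javascript
// Translate a 1-based page number into limit/offset for rt.tags.
function pageToParams(page, pageSize) {
    return { limit: pageSize, offset: (page - 1) * pageSize };
}

// Number of pages for the total_items value returned by rt.tags.
function pageCount(totalItems, pageSize) {
    return Math.ceil(totalItems / pageSize);
}

// Put database rows into the order of the ID list returned by rt.tags.
// IDs are compared as strings because rt.tags returns them as strings.
// Rows whose ID is not in the list are dropped; IDs without a row are skipped.
function orderRows(ids, rows) {
    var byId = Object.create(null);
    rows.forEach(function (row) {
        byId[String(row.id)] = row;
    });
    return ids
        .map(function (id) { return byId[String(id)]; })
        .filter(function (row) { return row !== undefined; });
}
```

For the example above, pageToParams(3, 2) gives limit=2 and offset=4, and pageCount(108, 2) gives 54 pages.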

Return the top n tags of a bucket.

rt.toptags(
    {
        bucket: "concerts",
        amount: 3
    },
    function (err, resp) {
        // resp contains:
        // {
        //   "total_items": 18374,
        //   "items": [
        //     {"tag":"rock", "count":1720},
        //     {"tag":"pop", "count":1585},
        //     {"tag":"New York", "count":720}
        //   ]
        // }
    }
);

List all buckets with at least one item stored in Redis.

Important: This method uses the Redis KEYS command, which scans the whole keyspace and can block the server on large databases. Use with care.

rt.buckets(
    function (err, resp) {
        // resp contains an array with all buckets
    }
);
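One way to use this method with care is to call it rarely and reuse the result for a while. The wrapper below is a sketch of that idea and not a feature of Redis-Tagging. cachedBuckets, ttlMs and now are hypothetical names.

```javascript
// Wrap rt.buckets so the result is reused for `ttlMs` milliseconds.
// `rt` is a RedisTagging instance; `now` is an optional clock function
// (defaults to Date.now) that makes the wrapper easy to test.
function cachedBuckets(rt, ttlMs, now) {
    now = now || Date.now;
    var cached = null;
    var fetchedAt = 0;
    return function (callback) {
        if (cached !== null && now() - fetchedAt < ttlMs) {
            // still fresh: answer from memory without touching Redis
            return callback(null, cached);
        }
        rt.buckets(function (err, resp) {
            if (err) {
                return callback(err);
            }
            cached = resp;
            fetchedAt = now();
            callback(null, resp);
        });
    };
}
```

With var getBuckets = cachedBuckets(rt, 60000), repeated calls to getBuckets within one minute would trigger only a single call to rt.buckets.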

Removes a single bucket and all of its items.

rt.removebucket(
    {
        bucket: "concerts"
    },
    function (err, resp) {
        if (resp === true) {
            // bucket was removed
        }
    }
);

To get started:

  • Make sure your DB has the following fields for the items you want to tag (names don't need to match exactly):
    • id: A primary key to quickly find your item.
    • score: Any number you use to sort your data. This is usually a date. If you saved a date in date-format you need to convert it to a numeric timestamp.
    • tags: A list of tags for this item. It is up to you how you store this. Usually a normal string field is sufficient.
  • Do a set for each item to populate the Redis-Tagging data.
  • When you insert / update / delete items in your DB make sure you also tell Redis-Tagging about it.
  • Now use the methods described above to make intersections and get the IDs back.
  • Use the IDs to get the actual records from your DB and display them as usual.
  • Enjoy.
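The steps above can be sketched as a pair of small helpers that you would call from your own insert, update and delete code. The item shape { id, date, tags } and the function names are hypothetical. The score is computed in seconds since the Unix epoch, which matches the magnitude of the score used in the set example (1356341337).

```javascript
// Convert a Date (or anything the Date constructor accepts) to a numeric score.
// Assumption: seconds since the Unix epoch are precise enough for sorting.
function toScore(date) {
    return Math.floor(new Date(date).getTime() / 1000);
}

// Call after inserting or updating an item in your database.
// `item.tags` must be the complete array of tags for the item.
function syncItem(rt, bucket, item, callback) {
    rt.set(
        {
            bucket: bucket,
            id: item.id,
            tags: item.tags,
            score: toScore(item.date)
        },
        callback
    );
}

// Call after deleting an item from your database.
function unsyncItem(rt, bucket, id, callback) {
    rt.remove({ bucket: bucket, id: id }, callback);
}
```

To populate Redis-Tagging initially, you could loop over all existing rows and call syncItem for each of them.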

For the changelog, see https://github.com/smrchy/redis-tagging/blob/master/CHANGELOG.md

Other projects:

  • node-cache: Simple and fast Node.js internal caching. Node internal in-memory cache like memcached.
  • rsmq: A lightweight message queue for Node.js that requires no dedicated queue server. Just a Redis server.
  • redis-sessions: An advanced session store for Node.js and Redis.
  • rsmq-worker: Helper to implement a worker based on RSMQ (Redis Simple Message Queue).
  • redis-notifications: A Redis-based notification engine. It implements the rsmq-worker to safely create notifications and recurring reports.
  • task-queue-worker: A powerful tool for background processing of tasks that are run by making standard HTTP requests.
  • obj-schema: Simple module to validate an object by a predefined schema.
  • connect-redis-sessions: A connect or express middleware to use redis-sessions that lets you handle multiple sessions per user_id.
  • systemhealth: Node module to run simple custom checks for your machine or its connections. It will use redis-heartbeat to send the current state to Redis.
  • soyer: Soyer is a small lib for server-side use of Google Closure Templates with Node.js.
  • grunt-soy-compile: Compile Google Closure Templates (SOY), including the handling of XLIFF language files.
  • backlunr: A solution to bring Backbone Collections together with the browser full-text search engine Lunr.js.

Please see the LICENSE.md file.