Full-Text Searching with Lunr.js

December 21, 2019

Lunr.js is a full-text search library for JavaScript. It lets us run complex searches over a collection of data. It's small, powerful, and, most importantly, easy to use.

Full-text search is a technique for searching the text of documents or database records. It is typically used to quickly find the documents or records that match a keyword, and it ranks the matches by relevance using a scoring system.
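To make the idea of indexing and scoring concrete, here is a toy sketch in plain JavaScript. It is not how Lunr.js works internally (Lunr.js adds stemming, stop words, and a more refined scoring model). The names tokenize, buildIndex, and search are made up for this illustration, and the score is simply the number of times the query words appear in a document.

```javascript
// Toy inverted index: maps each word to the documents that contain it.
// tokenize, buildIndex and search are hypothetical helpers, not part of Lunr.js.
function tokenize(text) {
	return text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean)
}

function buildIndex(docs) {
	const index = new Map() // word -> Map(docId -> occurrences)
	for (const doc of docs) {
		for (const word of tokenize(doc.text)) {
			if (!index.has(word)) index.set(word, new Map())
			const counts = index.get(word)
			counts.set(doc.id, (counts.get(doc.id) || 0) + 1)
		}
	}
	return index
}

function search(index, query) {
	const scores = new Map() // docId -> total occurrences of the query words
	for (const word of tokenize(query)) {
		const counts = index.get(word)
		if (!counts) continue
		for (const [id, count] of counts) {
			scores.set(id, (scores.get(id) || 0) + count)
		}
	}
	// Turn the scores into result objects, highest score first
	return Array.from(scores, ([ref, score]) => ({ ref, score })).sort(
		(a, b) => b.score - a.score
	)
}

const toyIndex = buildIndex([
	{ id: "a", text: "Java is a programming language" },
	{ id: "b", text: "Coffee from Java: Java beans, Java roast" },
])

console.log(search(toyIndex, "java language"))
```

Document "b" ranks first here because "java" appears in it three times, while document "a" matches each query word once. A real search library weighs terms more carefully, but the shape of the result (a reference plus a score) is the same one we will see from Lunr.js later.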

Lunr.js is often compared to Solr, a full-featured search server built on Apache Lucene that supports many languages. Lunr.js describes itself as "a bit like Solr, but much smaller and not as bright", so think of it as a smaller, simpler alternative that runs inside your JavaScript app rather than as a separate server. I personally recommend Lunr.js if you are building JavaScript apps.

It is very hard to implement full-text search yourself, especially in the browser. Fortunately, Lunr.js makes it easy.

Getting Started

First, we need to initialize a Node.js project. You can also use an existing front-end project, such as a React or Angular app, since Lunr.js works in both Node.js and the browser.

$ mkdir learn-lunr
$ cd learn-lunr
$ npm init -y

Then, we need to install the lunr package with npm.

$ npm install lunr

I have published the source code of this entire project on my GitHub. You can clone it to your computer by running this command in your terminal.

$ git clone https://github.com/rahmanfadhil/learn-lunr.git

Prepare the Data

Let's say we have a variable that contains a list of blog posts.

index.js
const posts = [
	{
		id: "1",
		title: "What is JavaScript?",
		description:
			"JavaScript is a high-level, object-oriented programming language based on the ECMAScript specification.",
	},
	{
		id: "2",
		title: "What is Java?",
		description:
			"Java is a cross-platform object-oriented programming language which at first developed by the Sun Microsystems.",
	},
	{
		id: "3",
		title: "What is React?",
		description:
			"React is a popular JavaScript library which heavily used to build single-page applications.",
	},
]

Here, we have a collection of fake blog posts, each with a title and a description. Every document should have a unique identifier in its id property, plus whatever other fields hold the information that helps users find what they want.

Now, we need to create a Lunr.js instance.

index.js
const lunr = require("lunr")

const posts = [
	// ...
]

const idx = lunr(function () {
	this.field("title")
	this.field("description")

	for (let i = 0; i < posts.length; i++) {
		this.add(posts[i])
	}
})

Here, we build the Lunr.js index and store it in the idx variable, using the lunr function that we imported from the lunr package. The lunr function takes a callback in which we declare the fields we want to search with this.field, which in this case are title and description. By default, Lunr.js uses the id property as each document's reference.

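We also loop through the posts array and add every post to the index with the add method. In Lunr.js 2.x, documents have to be added inside this callback, because the index cannot be modified once it has been built.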

One thing to keep in mind is that we need to use the function keyword for the callback instead of an arrow function. Otherwise, we can't access the this value, which is the object that provides the methods for registering our fields and documents.
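The difference comes down to how JavaScript binds this. The sketch below uses a made-up configure function (it is not Lunr.js itself) that invokes its callback with Function.prototype.call, a common way for a library to supply this to a callback. A regular function sees the builder as this, while an arrow function ignores that binding and keeps the this of the surrounding scope.

```javascript
// configure is a hypothetical stand-in for a library function that
// supplies a builder object to its callback as `this`.
function configure(callback) {
	const builder = {
		fields: [],
		field(name) {
			this.fields.push(name)
		},
	}
	// Bind `this` to the builder and also pass it as the first argument
	callback.call(builder, builder)
	return builder.fields
}

// Regular function: `this` is the builder
const withFunction = configure(function () {
	this.field("title")
})

// Arrow function: `this` is NOT the builder, so use the argument instead
const withArrow = configure((builder) => {
	builder.field("title")
})

console.log(withFunction) // [ 'title' ]
console.log(withArrow) // [ 'title' ]
```

As far as I can tell, Lunr.js 2.x also passes the builder as the first argument to the callback, so the arrow-function form may work there too. Treat that as an assumption and check the documentation for the version you use.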

Search the Data

After we set up our index, searching the data is very easy. Call the search method and pass in the keyword, much like typing a query into Google Search.

const result = idx.search("java")
console.log(result)

Result:

[
  {
    ref: '2',
    score: 0.9279999999999999,
    matchData: { metadata: [Object: null prototype] }
  }
]

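If we search for posts related to "java", we get the second post, as expected. The posts that mention "JavaScript" are not returned, because Lunr.js matches whole words (after stemming) rather than substrings. Note that each result contains only a ref and a score, not the post itself. The ref holds the unique id of that particular post, and the score tells us how closely it matches the search query.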

Now, let's search for "object oriented language". Here is what we get.

const result = idx.search("object oriented language")
console.log(result)

Result:

[
  {
    ref: '1',
    score: 0.8244561844027856,
    matchData: { metadata: [Object: null prototype] }
  },
  {
    ref: '2',
    score: 0.793279269866546,
    matchData: { metadata: [Object: null prototype] }
  }
]

We get the first and the second posts, because both of them mention "object-oriented programming language".

Get the Posts

Now we have successfully searched our documents, but all we get back is the unique id of each post that matches the query. Using those ids, we can look up the posts with the function below.

function searchPosts(query) {
	const result = idx.search(query)

	return result.map((item) => {
		return posts.find((post) => item.ref === post.id)
	})
}

Now, if we search using the searchPosts function, here is what we get.

const result = searchPosts("object oriented language")
console.log(result)

Result:

[
  {
    id: '1',
    title: 'What is JavaScript?',
    description: 'JavaScript is a high-level, object-oriented programming language based on the ECMAScript specification.'
  },
  {
    id: '2',
    title: 'What is Java?',
    description: 'Java is a cross-platform object-oriented programming language which at first developed by the Sun Microsystems.'
  }
]
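
The posts.find call inside searchPosts scans the whole array for every result, which is fine for three posts but gets slower as the collection grows. One possible alternative, sketched below, builds a Map from id to post once and looks up each ref directly. The name resolveResults is made up for this example, and the results array is hard-coded in the same shape that idx.search returns so that the snippet runs without Lunr.js.

```javascript
const posts = [
	{ id: "1", title: "What is JavaScript?" },
	{ id: "2", title: "What is Java?" },
	{ id: "3", title: "What is React?" },
]

// Build the id -> post lookup table once
const postsById = new Map(posts.map((post) => [post.id, post]))

// resolveResults is a hypothetical helper: it turns search results
// ({ ref, score, ... }) into the posts they refer to, keeping their order
// and skipping any ref that has no matching post.
function resolveResults(results) {
	return results
		.map((item) => postsById.get(item.ref))
		.filter((post) => post !== undefined)
}

// Hard-coded stand-in for the output of idx.search(...)
const sampleResults = [
	{ ref: "2", score: 0.9 },
	{ ref: "1", score: 0.8 },
]

console.log(resolveResults(sampleResults).map((post) => post.title))
// [ 'What is Java?', 'What is JavaScript?' ]
```

The posts come back in the same order as the search results, so the most relevant post stays first.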

Abdurrahman Fadhil

I'm a software engineer specialized in iOS and full-stack web development. If you have a project in mind, feel free to contact me and start the conversation.