Sort with indexes

FaunaDB’s sorting is accomplished using indexes, specifically by ordering of an index’s values. The values field in an index defines one or more fields that are returned in the result set, and the fields can be in ascending or descending order.

This tutorial demonstrates how to achieve various sorting goals using indexes:

This tutorial assumes that you have successfully prepared your database by creating the necessary collections and documents.

Sort a single field

To sort a single field, we need to construct an appropriate index. For this example, let’s sort the documents in the People collection by the first field. Copy the following query, paste it into the Shell, and run it:

CreateIndex({
  name: "people_sort_by_first_asc",
  source: Collection("People"),
  values: [
    { field: ["data", "first"] },
    { field: ["ref"] }
  ]
})

The points of interest for this query:

  • It is a good practice to name an index after its collection, its purpose, which field(s) are involved in the purpose, and the sort direction.

  • We specify two fields for values:

    1. The first field, which exists in the document’s data field.

    2. The document’s ref, so that we can easily retrieve the document.

When you run the query, the result should be similar to:

{
  "ref": Index("people_sort_by_first_asc"),
  "ts": 1565314335810000,
  "active": true,
  "serialized": true,
  "name": "people_sort_by_first_asc",
  "source": Collection("People"),
  "values": [
    {
      "field": [
        "data",
        "first"
      ]
    },
    {
      "field": [
        "ref"
      ]
    }
  ],
  "partitions": 8
}

FaunaDB indexes are initially populated by a background task, that may take anywhere from a few seconds to several minutes to complete, largely due to the number of documents in the source collection. If you execute queries using a freshly-created index and you receive no results, likely the index is still being built. Try again in a few seconds.

Note that when you create an index, any documents created afterwards that involve that index would always show up.

Before we see the sorted results, let’s see the order we get when using using the collection index (created when we prepared our tutorial data). Copy the following query, paste it into the Shell, and run it:

Map(
  Paginate(Match(Index("all_people"))),
  Lambda(
    "X",
    Select(["data", "first"], Get(Var("X")))
  )
)

The points of interest for this query are:

  • We use the Map function to process each of the Paginate results with a Lambda function.

  • We use the variable named X, just to provide a variable name via Var. You can use any variable name you like.

  • We only need one variable for the Lambda function, because our index only returns one value for each document in the collection: the document’s Ref. If our index specified a values field, we would need to use one variable per field defined by the values field.

  • The Lambda function calls the Get function to retrieve the document specified by the X variable. To avoid returning everything in each document, the Select function is used to extract just the first field from each document.

When you run the query, the output should be similar to:

{
  "data": [
    "Alan",
    "Tim",
    "Leslie",
    "Marvin",
    "Alan",
    "Grace",
    "Stephen"
  ]
}

The order you see depends on the order of each document’s ref during the creation query.

Now, let’s use our new index to see the names in order. Copy the following query, paste it into the Shell, and run it:

Paginate(Match(Index("people_sort_by_first_asc")))

There was no need to use Map this time: our index contains the value of the first field, so there is no need to use the Get function to retrieve the document. This also reduces the number of read operations, which could save you some money.

When you run this query, the output should be similar to:

{
  "data": [
    [
      "Alan",
      Ref(Collection("People"), "240166254282801673")
    ],
    [
      "Alan",
      Ref(Collection("People"), "240166254282805769")
    ],
    [
      "Grace",
      Ref(Collection("People"), "240166254282806793")
    ],
    [
      "Leslie",
      Ref(Collection("People"), "240166254282803721")
    ],
    [
      "Marvin",
      Ref(Collection("People"), "240166254282804745")
    ],
    [
      "Stephen",
      Ref(Collection("People"), "240166254282807817")
    ],
    [
      "Tim",
      Ref(Collection("People"), "240166254282802697")
    ]
  ]
}

What if we want to show the list in reverse order? For that, we need to construct another index. Copy the following query, paste it into the Shell, and run it:

CreateIndex({
  name: "people_sort_by_first_desc",
  source: Collection("People"),
  values: [
    { field: ["data", "first"], reverse: true },
    { field: ["ref"] }
  ]
})

This query creates an index almost identical to the first index that we created. The only two differences are:

  1. The index name indicates "desc"ending order.

  2. The first field in values sets reverse to true. This causes the index to sort by the first field in descending order.

After we wait for the index to populate (anywhere from a few seconds to a few minutes), we can run a revised version of our "sort" query. Copy the following query, paste it into the shell, and run it:

Paginate(Match(Index("people_sort_by_first_desc")))

When you run this query, the result should be similar to:

{
  "data": [
    [
      "Tim",
      Ref(Collection("People"), "240166254282802697")
    ],
    [
      "Stephen",
      Ref(Collection("People"), "240166254282807817")
    ],
    [
      "Marvin",
      Ref(Collection("People"), "240166254282804745")
    ],
    [
      "Leslie",
      Ref(Collection("People"), "240166254282803721")
    ],
    [
      "Grace",
      Ref(Collection("People"), "240166254282806793")
    ],
    [
      "Alan",
      Ref(Collection("People"), "240166254282801673")
    ],
    [
      "Alan",
      Ref(Collection("People"), "240166254282805769")
    ]
  ]
}

Sort multiple fields

To sort on multiple fields, there are a couple of approaches. In this section, we demonstrate using a single index to sort multiple fields.

For this demonstration, we’ll sort by last and then first. To do so, we need to create an appropriate index. Copy the following query, paste it into the Shell, and run it:

CreateIndex({
  name: "people_sort_by_last_first_asc",
  source: Collection("People"),
  values: [
    { field: ["data", "last"] },
    { field: ["data", "first"] },
    { field: ["ref"] }
  ]
})

Then, after we wait for the index to populate (anywhere from a few seconds to a few minutes), we can see the results. Copy the following query, paste it into the Shell, and run it:

Paginate(Match(Index("people_sort_by_last_first_asc")))

When you run this query, the result should be similar to:

{
  "data": [
    [
      "Cook",
      "Stephen",
      Ref(Collection("People"), "240166254282807817")
    ],
    [
      "Cook",
      "Tim",
      Ref(Collection("People"), "240166254282802697")
    ],
    [
      "Hopper",
      "Grace",
      Ref(Collection("People"), "240166254282806793")
    ],
    [
      "Lamport",
      "Leslie",
      Ref(Collection("People"), "240166254282803721")
    ],
    [
      "Minsky",
      "Marvin",
      Ref(Collection("People"), "240166254282804745")
    ],
    [
      "Perlis",
      "Alan",
      Ref(Collection("People"), "240166254282805769")
    ],
    [
      "Turing",
      "Alan",
      Ref(Collection("People"), "240166254282801673")
    ]
  ]
}

What if we want the first name to be in descending order? We need another index. Copy the following query, paste it into the Shell, and run it:

CreateIndex({
  name: "people_sort_by_last_asc_first_desc",
  source: Collection("People"),
  values: [
    { field: ["data", "last"] },
    { field: ["data", "first"], reverse: true },
    { field: ["ref"] }
  ]
})

Then, after we wait for the index to populate (anywhere from a few seconds to a few minutes), we can see the results. Copy the following query, paste it into the Shell, and run it:

Paginate(Match(Index("people_sort_by_last_asc_first_desc")))

When you run this query, the result should be similar to:

{
  "data": [
    [
      "Cook",
      "Tim",
      Ref(Collection("People"), "240166254282802697")
    ],
    [
      "Cook",
      "Stephen",
      Ref(Collection("People"), "240166254282807817")
    ],
    [
      "Hopper",
      "Grace",
      Ref(Collection("People"), "240166254282806793")
    ],
    [
      "Lamport",
      "Leslie",
      Ref(Collection("People"), "240166254282803721")
    ],
    [
      "Minsky",
      "Marvin",
      Ref(Collection("People"), "240166254282804745")
    ],
    [
      "Perlis",
      "Alan",
      Ref(Collection("People"), "240166254282805769")
    ],
    [
      "Turing",
      "Alan",
      Ref(Collection("People"), "240166254282801673")
    ]
  ]
}

The only notable difference is that, this time, the "Tim Cook" document appears before the "Stephen Cook" document.

Sort considerations

So far, we’ve looked at the "happy path" for sorting. Our indexes have been fully populated, all documents have had consistent values in the indexes fields, and the results have returned as we expected.

What happens when we’re not on the "happy path"? When we created our Letters collection, the extra field was notably less consistent than the other fields.

Let’s create an index to sort on the extra field and see what sort of results are achieved. Copy the following query, paste it into the Shell, and run it:

CreateIndex({
  name: "letters_sort_by_extra_asc",
  source: Collection("Letters"),
  values: [
    { field: ["data", "extra"] },
    { field: ["ref"] }
  ]
})

Then, after we wait for the index to populate (anywhere from a few seconds to a few minutes), we can see the results. Copy the following query, paste it into the Shell, and run it:

Map(
  Paginate(Match(Index("letters_sort_by_extra_asc"))),
  Lambda(
    ["extra", "ref"],
    Var("extra")
  )
)

The points of interest for this query:

  • Our letters_sort_by_extra_asc index contains two fields in values, so our Lambda function must accept two variables.

  • Since our index contains the extra field in values, we don’t have to fetch the document to return that field. So, we don’t need to call the Get function, we can just return the value of the extra variable with the Var function.

When you run this query, the output should be similar to:

  "data": [
    11,
    24,
    "",
    "14th",
    "16th",
    "18th",
    "19th",
    "20th",
    "21st",
    "22nd",
    "24 + 1",
    "4th",
    "9th",
    "eighth",
    "fifteenth",
    "fifth",
    "First",
    "second",
    "seventeenth",
    "seventh",
    "sixth",
    "tenth",
    "third",
    "twenty-third",
    null,
    null
  ]
}

When FaunaDB orders values in indexes, it organizes the output by type and then sorts values of the same type. The type order is always the same:

  1. Nulls

  2. Numbers

  3. Booleans

  4. Dates

  5. Times

  6. Refs

  7. Objects

  8. Arrays

  9. Strings

Nulls always appear last in the output (or first, if the field containing Nulls has reversed: true set).

So, our output contains Numbers first, then Strings, then Nulls.

Conclusion

This tutorial has demonstrated how to sort documents, by single and multiple fields, using indexes. We’ve also seen how variation in field values can affect sort results.

While it may be a bit more work than you might expect, especially if you are familiar with SQL sorting, FaunaDB’s sorting can be comparable provided that you create all of the indexes required for your sorting situations.

Was this article helpful?

We're sorry to hear that.
Tell us how we can improve! documentation@fauna.com

Thank you for your feedback!