Term-level queries

Term-level queries find results based on precise values in structured data, for example by asset type, status, or GUID. [1]

Unlike full-text queries, the search input you use in a term-level query is not analyzed. This means what you search for is matched exactly against what's stored in an attribute: no fuzzy matching is applied. [2]

Details

Below are the various kinds of term-level queries, sorted with the most commonly used at the top. Each description covers typical usage and is linked to Elasticsearch's own documentation for greater detail. (In most cases each kind of query has many more options than are documented here.)

You will often combine these queries to create more complex criteria.

Term

Term queries return results where the asset's value for that attribute matches exactly what you're searching.

What if I want it to be a case-insensitive match?

You can still use term queries for case-insensitive matching.

  • Java: add a second parameter of true to the predicate method
  • Python: add a named parameter of case_insensitive=True to the predicate method
  • Raw REST API: send through "case_insensitive": true to the API directly
Build the query and request
IndexSearchRequest index = client.assets.select() // (1)
    .where(Asset.NAME.eq("some-name", true)) // (2)
    .toRequest();
  1. You can search across all assets using the select() method of the assets member on any client.

  2. Chain a where() onto the select, with the static constant representing a field of the type you want to search to start a query, in this case the NAME of an Asset. Adding the eq() predicate creates a term query. You can also optionally send a second parameter as true to do a case-insensitive match.

    Equivalent query through Elastic
    Query byTerm = TermQuery.of(t -> t
            .field("name.keyword")
            .value("some-name")
            .caseInsensitive(true))
        ._toQuery();

Terms

Terms queries return results where the asset's value for that attribute exactly matches at least one of the values you're searching for.

Build the query and request
IndexSearchRequest index = client.assets.select() // (1)
    .where(Asset.TYPE_NAME.in(Set.of(Table.TYPE_NAME, Column.TYPE_NAME))) // (2)
    .toRequest();
  1. You can search across all assets using the select() method of the assets member on any client.

  2. Chain a where() onto the select, with the static constant representing a field of the type you want to search to start a query, in this case the TYPE_NAME of an asset. Adding the in() predicate creates a terms query.

    Equivalent query through Elastic
    Query byType = TermsQuery.of(t -> t
            .field("__typeName.keyword")
            .terms(TermsQueryField.of(f -> f
                .value(List.of(
                    FieldValue.of(Table.TYPE_NAME),
                    FieldValue.of(Column.TYPE_NAME))))))
        ._toQuery();

Exists

Exists queries return results where the asset contains a value for that attribute. For example, this query would find all assets that have been changed after being created:

Build the query and request
IndexSearchRequest index = client.assets.select() // (1)
    .where(Asset.UPDATED_BY.hasAnyValue()) // (2)
    .toRequest();
  1. You can search across all assets using the select() method of the assets member on any client.

  2. Chain a where() onto the select, with the static constant representing a field of the type you want to search to start a query, in this case the person who last updated an asset. Adding the hasAnyValue() predicate creates an exists query. This will only match results where the field has some value on the asset.

    Equivalent query through Elastic
    Query byExistence = ExistsQuery.of(q -> q
            .field("__modifiedBy"))
        ._toQuery();

Range

Range queries return results where the asset's value for that attribute is within the range you're searching. (This works for numeric fields only, which for Atlan includes dates, since they're stored as epoch values.) For example, this query would find all assets that were created between January 1, 2022 and February 1, 2022:

Build the query and request
IndexSearchRequest index = client.assets.select() // (1)
    .where(Asset.CREATE_TIME.between(1640995200000L, 1643673600000L)) // (2)
    .toRequest();
  1. You can search across all assets using the select() method of the assets member on any client.

  2. Chain a where() onto the select, with the static constant representing a field of the type you want to search to start a query, in this case the time an asset was created. Adding the between() predicate creates a range query. In this example between() allows you to specify two values any matching assets should be between. You could also use:

    • gt() for any values strictly greater than a single number
    • gte() for any values greater than or equal to a single number
    • lt() for values strictly less than a single number
    • lte() for values less than or equal to a single number
    • eq() for values strictly equal to a single number
    Equivalent query through Elastic
    Query byRange = RangeQuery.of(r -> r
            .field("__timestamp")
            .gte(JsonData.of(1640995200000L))
            .lt(JsonData.of(1643673600000L)))
        ._toQuery();

Prefix

Prefix queries return results where the asset's value for that attribute starts with what you're searching. For example, this query would find all columns whose qualifiedName starts with default/snowflake/1662194632 (in other words, all columns in any table, view, materialized view, schema or database in that connection):

What if I want it to be a case-insensitive match?

You can still use prefix queries for case-insensitive matching.

  • Java: add a second parameter of true to the predicate method
  • Python: add a named parameter of case_insensitive=True to the predicate method
  • Raw REST API: send through "case_insensitive": true to the API directly
Build the query and request
IndexSearchRequest index = client.assets.select() // (1)
    .where(Asset.QUALIFIED_NAME.startsWith("default/snowflake/1662194632", true)) // (2)
    .toRequest();
  1. You can search across all assets using the select() method of the assets member on any client.

  2. Chain a where() onto the select, with the static constant representing a field of the type you want to search to start a query, in this case the QUALIFIED_NAME of an Asset. Adding the startsWith() predicate creates a prefix query. This will only match results where the field's value starts with the provided string. You can also optionally send a second parameter as true to do a case-insensitive match.

    Equivalent query through Elastic
    Query byPrefix = PrefixQuery.of(p -> p
            .field("qualifiedName")
            .value("default/snowflake/1662194632")
            .caseInsensitive(true))
        ._toQuery();

Wildcard

Wildcard queries return results where the asset's value for that attribute matches the wildcard pattern you're searching. This can be useful for searching based on simple naming conventions. For example, this query would find all assets whose name starts with C_ and ends with _SK with any characters in-between:

Build the query and request
IndexSearchRequest index = client.assets.select() // (1)
    .where(Asset.NAME.wildcard("C_*_SK", true)) // (2)
    .toRequest();
  1. You can search across all assets using the select() method of the assets member on any client.

  2. Chain a where() onto the select, with the static constant representing a field of the type you want to search to start a query, in this case the NAME of an Asset. Adding the wildcard() predicate creates a wildcard query. This will only match results where the field's value starts with C_ and ends with _SK. You can also optionally send a second parameter as true to do a case-insensitive match.

    Equivalent query through Elastic
    Query byWildcard = WildcardQuery.of(w -> w
            .field("name.keyword")
            .value("C_*_SK")
            .caseInsensitive(true))
        ._toQuery();

Regexp

Regexp queries return results where the asset's value for that attribute matches the regular expression you're searching. This can be useful for searching based on more complicated naming conventions. For example, this query would find all assets whose name starts with C_ and ends with _SK with the characters ADDR somewhere in-between:

Build the query and request
IndexSearchRequest index = client.assets.select() // (1)
    .where(Asset.NAME.regex("C_[A-Za-z0-9_]*ADDR[A-Za-z0-9_]*_SK", true)) // (2)
    .toRequest();
  1. You can search across all assets using the select() method of the assets member on any client.

  2. Chain a where() onto the select, with the static constant representing a field of the type you want to search to start a query, in this case the NAME of an Asset. Adding the regex() predicate creates a regexp query. This will only match results where the field's value starts with C_, ends with _SK, and contains ADDR somewhere in between, surrounded by any number of alphanumeric characters or underscores. You can also optionally send a second parameter as true to do a case-insensitive match.

    Equivalent query through Elastic
    Query byRegex = RegexpQuery.of(r -> r
            .field("name.keyword")
            .value("C_[A-Za-z0-9_]*ADDR[A-Za-z0-9_]*_SK")
            .caseInsensitive(true))
        ._toQuery();

Terms set

Terms set queries return results where the asset's values for that attribute exactly match at least a minimum number of the values you're searching for. For example, this query would find all assets with at least two of the three specified Atlan tags:

Build the query and request
IndexSearchRequest index = client.assets.select() // (1)
    .where(Asset.ATLAN_TAGS.in(List.of( // (2)
            client.getAtlanTagCache().getIdForName("PII"),
            client.getAtlanTagCache().getIdForName("SPI"),
            client.getAtlanTagCache().getIdForName("Restricted")),
        2)) // (3)
    .toRequest();
  1. You can search across all assets using the select() method of the assets member on any client.

  2. Chain a where() onto the select, with the static constant representing a field of the type you want to search to start a query, in this case the ATLAN_TAGS of an Asset. Adding the in() predicate creates a terms query. This will only match results where the field's values exactly overlap with some number of values in the provided list.

  3. You therefore also need to specify how many values (minimally) must be present and overlapping in the field to be considered a match.

    Equivalent query through Elastic
    Query byTerms = TermsSetQuery.of(t -> t
            .field("__traitNames")
            .terms(List.of(
                AtlanTagCache.getIdForName("PII"),
                AtlanTagCache.getIdForName("SPI"),
                AtlanTagCache.getIdForName("Restricted")))
            .minimumShouldMatchScript(Script.of(s -> s
                .inline(InlineScript.of(i -> i
                    .source("params.get('minimum');")
                    .params(Map.of("minimum", JsonData.of(2))))))))
        ._toQuery();

Fuzzy

Fuzzy queries return results where the asset's value for that attribute is similar to the value you're searching. This is determined by Levenshtein edit distance (the number of one-character changes needed to match what you're searching).

Are you sure this is what you want?

This is a very simplistic fuzzy-matching algorithm, and it may end up matching both more and less than you want it to. For more advanced fuzzy-matching, you probably want to use full-text queries. Since this is possible through Atlan's search, it's included here for completeness.

For example, this query would find all assets whose name is within one edit of block (so it would match block, clock, lock, black, and so on):

Build the query
Query byLevenshtein = FuzzyQuery.of(f -> f
        .field("name.keyword")
        .value("block")
        .fuzziness("1"))
    ._toQuery();
Build the request
IndexSearchRequest index = IndexSearchRequest
    .builder(byLevenshtein)
    .build();
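
To see why those names fall within one edit of block, here is a minimal sketch of the classic Levenshtein distance in plain Java, where an edit is a single-character insertion, deletion, or substitution. The class name is hypothetical. One caveat: by default Elasticsearch also treats swapping two adjacent characters as a single edit, which this simple version does not do.

```java
// Hypothetical standalone implementation of Levenshtein edit distance.
class LevenshteinSketch {

    // Minimum number of single-character insertions, deletions, or
    // substitutions needed to turn a into b.
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning an empty prefix of a into b[0..j) takes j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning a[0..i) into an empty string takes i deletions
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(
                        Math.min(curr[j - 1] + 1, prev[j] + 1), // insertion, deletion
                        prev[j - 1] + cost);                    // substitution or match
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        for (String candidate : new String[] {"block", "clock", "lock", "black", "clack"}) {
            System.out.println(candidate + " -> " + distance("block", candidate));
        }
    }
}
```

Here clack is two edits away from block, so a fuzziness of 1 would not match it.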

Footnotes

  1. This page is a summary of the details in the Elasticsearch Guide's Term-level queries

  2. Ok, that's not strictly true, since as you'll see there are some term-level queries that give very basic fuzziness. And actually, a normalizer can be applied as well, to make these searches case-insensitive. But the intent of term-level queries is to do exact matches with minimal fuzziness.
