ContentFieldOptions

ContentFieldOptions is an option for CreateSearchIndex and related functions that specifies how the different fields of the content being indexed are handled.

Details

  • ContentFieldOptions-><|"name₁"->opts₁,"name₂"->opts₂,…|> specifies that the field named nameᵢ should be indexed using the options given in the association optsᵢ.
  • Possible entries in each optsᵢ association include:
    "BulkRetrievalOptimized"    whether to index the field to optimize for bulk retrieval
    "CamelCaseMatching"         whether to use camel case to match multiword forms
    "DeleteStopWords"           whether to delete stop words before indexing
    "IgnoreCase"                whether case is ignored for indexing and matching
    "Language"                  what language to assume the field is in
    "LengthWeighted"            whether matches in shorter fields count more
    "Searchable"                whether the field should be searchable
    "StemmingMethod"            how to stem words for indexing and matching
    "Stored"                    whether to store the literal content of the field in the index
    "Tokenized"                 whether the field should be tokenized before indexing
    "Type"                      overall type of field
    "Weight"                    how to weight this field when searching
  • Typical types of fields include: "Title", "Text", "String", "Date", "DateTime", "Integer", "Real", "Boolean".
  • Different field types are given different default weights.
  • Field types such as "Title" and "Integer" are stored by default, while those such as "Text" are not.
  • "Title" and "Text" are tokenized and undergo stopword deletion by default, unlike "String" or "Date".
  • All field types are searchable by default.
  • No field type is optimized for bulk retrieval by default.
  • By default, for all field types, a match in a longer field has a lower impact on the final score than a match in a shorter field. To disable this behavior, set "LengthWeighted" to False.
  • The default value for "StemmingMethod" is "Porter". Alternative values include "Kstem" and None.
  • If explicit options are specified in addition to a type, the explicit options override defaults for that type.
  • All->opts can be used to indicate option settings to be used for all types by default.
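The overall shape of such a specification can be sketched as follows. The field names "Headline", "Body" and "Published" are hypothetical, and placing All alongside the named fields in a single association is an assumption about how the All->opts form combines with per-field entries.

```wolfram
(* Hypothetical field names; only the option names come from the list above *)
fieldOptions = <|
   (* assumed placement: settings used for every field unless overridden *)
   All -> <|"StemmingMethod" -> "Kstem"|>,
   (* a type plus an explicit option that overrides that type's default *)
   "Headline" -> <|"Type" -> "Title", "Weight" -> 5|>,
   "Body" -> <|"Type" -> "Text", "Stored" -> True|>,
   "Published" -> <|"Type" -> "Date"|>
   |>;

(* The association is then supplied when the index is created, e.g.
   CreateSearchIndex[objects, ContentFieldOptions -> fieldOptions] *)
```

Here "Body" is given the "Text" type, which is not stored by default, so the explicit "Stored" -> True entry illustrates an override of a type default.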

Examples

Basic Examples

Create an example index, setting the language of "Field2" to French:

The French stopwords "le" and "la" are ignored, resulting in a match:
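A minimal sketch of this pair of steps, assuming made-up sample content, and assuming that the index is built from ContentObject expressions and that a single field can be queried with the "field"->form syntax:

```wolfram
(* Per-field options: treat "Field2" as French text *)
opts = <|"Field2" -> <|"Language" -> "French"|>|>;

(* Hypothetical content; the same phrase is placed in both fields *)
index = CreateSearchIndex[
   {ContentObject[<|"Field1" -> "le chat", "Field2" -> "le chat"|>]},
   ContentFieldOptions -> opts];

(* "la" and "le" should be dropped as French stop words in "Field2",
   leaving only "chat" to be matched *)
TextSearch[index, "Field2" -> "la chat"]
```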

Store the textual content so that it is returned in the search result:

Setting the type of "Field2" to "Title" weights it more heavily when ranking search results, and also returns its value in content objects:
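Both settings can be sketched together. The content is hypothetical, the use of ContentObject input is an assumption, and "Title" is assumed to be the field type intended for "Field2", since title fields are stored by default:

```wolfram
opts = <|
   (* keep the literal text so it can be returned with the results *)
   "Field1" -> <|"Stored" -> True|>,
   (* assumed type for "Field2"; "Title" fields are stored by default *)
   "Field2" -> <|"Type" -> "Title"|>
   |>;

index = CreateSearchIndex[
   {ContentObject[<|"Field1" -> "a long passage about cats",
      "Field2" -> "Cats"|>]},
   ContentFieldOptions -> opts];

TextSearch[index, "cats"]
```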

If exact case is important, "IgnoreCase" can be set to False for a field:

Since the case does not match, no results are found:
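As a sketch under the assumption of a hypothetical "Code" field holding a mixed-case identifier, with the "field"->form query syntax also assumed:

```wolfram
(* Case is preserved for indexing and matching in "Code" *)
opts = <|"Code" -> <|"IgnoreCase" -> False|>|>;

index = CreateSearchIndex[
   {ContentObject[<|"Code" -> "AbC"|>]},
   ContentFieldOptions -> opts];

(* The all-lowercase query differs in case from the indexed value,
   so it is not expected to match *)
TextSearch[index, "Code" -> "abc"]
```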

"CamelCaseMatching" can be disabled for non-word content if desired:

This would match if "CamelCaseMatching" were enabled:
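One way this might look, assuming a hypothetical "Code" field whose value only resembles camel case, and assuming the "field"->form query syntax:

```wolfram
(* Do not split "QxZr" into the separate forms "Qx" and "Zr" *)
opts = <|"Code" -> <|"CamelCaseMatching" -> False|>|>;

index = CreateSearchIndex[
   {ContentObject[<|"Code" -> "QxZr"|>]},
   ContentFieldOptions -> opts];

(* With camel case matching disabled, a query for one part of the
   identifier is not expected to match *)
TextSearch[index, "Code" -> "zr"]
```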

Stemming can be disabled for non-word content:

This would match if stemming were enabled:
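A sketch using the None setting for "StemmingMethod", with a hypothetical "Tag" field and the "field"->form query syntax taken as assumptions:

```wolfram
(* Index and match the tag exactly as written, without stemming *)
opts = <|"Tag" -> <|"StemmingMethod" -> None|>|>;

index = CreateSearchIndex[
   {ContentObject[<|"Tag" -> "bings"|>]},
   ContentFieldOptions -> opts];

(* A stemmer would typically reduce "bings" to "bing"; without
   stemming the two forms stay distinct *)
TextSearch[index, "Tag" -> "bing"]
```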

The "Weight" of a field can be specified for higher result ranking:

When the match is in the "Keyword" field, the score is multiplied by the "Weight" of 10:
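The weighting can be sketched with two hypothetical documents that contain the same word in different fields. The "Body" field name and the sample text are assumptions:

```wolfram
(* Matches in "Keyword" count ten times as much *)
opts = <|"Keyword" -> <|"Weight" -> 10|>|>;

index = CreateSearchIndex[
   {ContentObject[<|"Keyword" -> "cat", "Body" -> "dog"|>],
    ContentObject[<|"Keyword" -> "dog", "Body" -> "cat"|>]},
   ContentFieldOptions -> opts];

(* The first document matches in the weighted field and is expected
   to rank above the second *)
TextSearch[index, "cat"]
```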

Non-searchable fields cannot be searched but, if stored, can be retrieved from the resulting content objects:
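A sketch with hypothetical "Body" and "Internal" fields, where "Internal" is kept in the index for retrieval only:

```wolfram
(* "Internal" is stored but excluded from searching *)
opts = <|"Internal" -> <|"Searchable" -> False, "Stored" -> True|>|>;

index = CreateSearchIndex[
   {ContentObject[<|"Body" -> "quarterly report",
      "Internal" -> "draft-17"|>]},
   ContentFieldOptions -> opts];

(* The document is found through "Body"; the stored "Internal" value
   should then be available in the returned content object *)
TextSearch[index, "report"]
```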

Disable stop word deletion for certain fields:

The stop word "or" is only found in those fields:
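As an illustration with hypothetical fields, both typed as "Text" so that stop word deletion is on by default and is switched off only for "Field1":

```wolfram
opts = <|
   (* keep stop words such as "or" in this field *)
   "Field1" -> <|"Type" -> "Text", "DeleteStopWords" -> False|>,
   (* default behavior: stop words are removed *)
   "Field2" -> <|"Type" -> "Text"|>
   |>;

index = CreateSearchIndex[
   {ContentObject[<|"Field1" -> "cat or dog", "Field2" -> "cat or dog"|>]},
   ContentFieldOptions -> opts];

(* Assumed "field"->form query syntax; only "Field1" still contains "or" *)
TextSearch[index, "Field1" -> "or"]
TextSearch[index, "Field2" -> "or"]
```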

By default, a match in a longer field has a lower impact on the final score than a match in a shorter field:

This behavior can be disabled by setting "LengthWeighted" to False:
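A sketch with one short and one long hypothetical document, both matching the same word in an assumed "Body" field:

```wolfram
(* Field length no longer influences the score for "Body" *)
opts = <|"Body" -> <|"LengthWeighted" -> False|>|>;

index = CreateSearchIndex[
   {ContentObject[<|"Body" -> "cat"|>],
    ContentObject[<|"Body" ->
       "a much longer passage that mentions a cat only once"|>]},
   ContentFieldOptions -> opts];

(* With length weighting disabled, the longer document is not
   penalized for its length *)
TextSearch[index, "cat"]
```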

Set "Tokenized" to False to require a verbatim match of a field:

The field needs to be queried explicitly, or it is not matched:

The field is only matched verbatim:
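The three steps can be sketched as follows, assuming a hypothetical "Name" field and the "field"->form query syntax:

```wolfram
(* Index the whole value of "Name" as a single unit *)
opts = <|"Name" -> <|"Tokenized" -> False|>|>;

index = CreateSearchIndex[
   {ContentObject[<|"Name" -> "Ada Lovelace"|>]},
   ContentFieldOptions -> opts];

(* Not expected to match: the field is not queried explicitly *)
TextSearch[index, "Ada Lovelace"]

(* Expected to match: explicit field, verbatim value *)
TextSearch[index, "Name" -> "Ada Lovelace"]

(* Not expected to match: only part of the value is given *)
TextSearch[index, "Name" -> "Ada"]
```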

When a field is used for document weighting, setting "BulkRetrievalOptimized" to True can improve the performance:
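Only the field option itself is sketched here. The "Rank" field and sample values are hypothetical, and the document weighting configuration that would read this field is deliberately left out:

```wolfram
(* Numeric field prepared for fast retrieval across many documents *)
opts = <|"Rank" -> <|"Type" -> "Integer", "BulkRetrievalOptimized" -> True|>|>;

index = CreateSearchIndex[
   {ContentObject[<|"Body" -> "first document", "Rank" -> 3|>],
    ContentObject[<|"Body" -> "second document", "Rank" -> 7|>]},
   ContentFieldOptions -> opts];
```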

Wolfram Research (2016), ContentFieldOptions, Wolfram Language function, https://reference.wolfram.com/language/ref/ContentFieldOptions.html (updated 2017).
