StringSplit

StringSplit["string"]

splits "string" into a list of substrings separated by whitespace.

StringSplit["string",patt]

splits into substrings separated by delimiters matching the string expression patt.

StringSplit["string",{p1,p2,…}]

splits at any of the pi.

StringSplit["string",patt->val]

inserts val at the position of each delimiter.

StringSplit["string",{p1->v1,…}]

inserts vi at the position of each delimiter pi.

StringSplit["string",patt,n]

splits into at most n substrings.

StringSplit[{s1,s2,…},p]

gives the list of results for each of the si.

Details and Options

  • StringSplit[s] does not return the whitespace characters that delimit the substrings it returns.
  • Whitespace includes any number of spaces, tabs, and newlines.
  • The string expression patt can contain any of the objects specified in the notes for StringExpression.
  • StringSplit[s] is equivalent to StringSplit[s,Whitespace].
  • If s contains two adjacent delimiters, StringSplit considers there to be a zero-length substring "" between them.
  • StringSplit[s,patt] by default gives the list of substrings of s that occur between delimiters defined by patt; it does not include the delimiters themselves.
  • StringSplit[s,patt->val] includes val at the position of each delimiter.
  • StringSplit[s,patt:>val] evaluates val only when the pattern is found.
  • StringSplit["string",{p1->v1,…,pa,…}] includes v1 at the position of delimiters matching p1, but omits delimiters matching pa.
  • By default, StringSplit[s,patt] drops zero-length substrings associated with delimiters that appear at the beginning or end of s.
  • StringSplit[s,patt,All] returns all substrings, including zero-length ones at the beginning or end.
  • Setting the option IgnoreCase->True makes StringSplit treat lowercase and uppercase letters as equivalent.
  • StringSplit["string",RegularExpression["regex"]] splits at delimiters matching the specified regular expression.
  • StringSplit[BioSequence["type","seq"],patt,…] splits the string "seq" by patt, yielding a list of biomolecular sequences. In this case, degenerate letters in patt are interpreted as wildcard patterns based on the type of biomolecular sequence. Use Verbatim["patt"] to match degenerate letters literally.
  • The documentation for BioSequence lists the degenerate letters supported by each type of biomolecular sequence.
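
Several of the notes above can be checked on small inputs. The strings below are illustrative choices, not taken from the original page:

```wolfram
(* Two adjacent delimiters yield a zero-length substring between them *)
StringSplit["a::b", ":"]
(* {"a", "", "b"} *)

(* Delimiters at the beginning or end do not produce "" by default *)
StringSplit[":a:b:", ":"]
(* {"a", "b"} *)

(* Mixed list: "-" is replaced by "|", while ":" is simply omitted *)
StringSplit["a-b:c", {"-" -> "|", ":"}]
(* {"a", "|", "b", "c"} *)
```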

Examples

Basic Examples  (2)

Pick out substrings delimited by whitespace:
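
The input for this example is not present in this copy. A minimal stand-in, using a sample string chosen here for illustration:

```wolfram
(* Runs of one or more spaces all count as a single whitespace delimiter *)
StringSplit["a bbb  cccc aa   d"]
(* a notebook displays the result without quotes: {a, bbb, cccc, aa, d} *)
```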

Show the substrings with quotes:
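
One way to display the quotes, assuming the same kind of sample string as above, is to wrap the result in InputForm:

```wolfram
(* InputForm prints strings with their quotation marks *)
StringSplit["a bbb  cccc aa   d"] // InputForm
(* {"a", "bbb", "cccc", "aa", "d"} *)
```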

Split a string at every --:
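
A sketch with an illustrative input; note that a run of three hyphens leaves one hyphen behind, because matching proceeds from left to right:

```wolfram
(* "---" is consumed as "--" followed by a leftover "-" *)
StringSplit["a--bbb---ccc--dddd", "--"]
(* {"a", "bbb", "-ccc", "dddd"} *)
```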

Scope  (11)

Split at every run of spaces:

Use string patterns:

Regular expressions:
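
As an illustration, with a sample string and regular expression chosen here rather than taken from the original page:

```wolfram
(* ":+" matches any run of one or more colons *)
StringSplit["a b::c d::e f g", RegularExpression[":+"]]
(* {"a b", "c d", "e f g"} *)
```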

Mixed regular expressions and string patterns:

Split into substrings separated by either delimiter:
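
A minimal sketch, assuming ":" and "-" as the two delimiters:

```wolfram
(* A list of patterns splits at any of its elements *)
StringSplit["a-b:c-d:e-f-g", {":", "-"}]
(* {"a", "b", "c", "d", "e", "f", "g"} *)
```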

Insert a value at the position of a delimiter:
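
For instance, with an illustrative input in which each "::" delimiter is replaced by "--" in the output list:

```wolfram
(* The rule patt -> val puts val where each delimiter was *)
StringSplit["a b::c d::e f g", "::" -> "--"]
(* {"a b", "--", "c d", "--", "e f g"} *)
```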

Include the delimiters in the output:
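
One way to do this is to name the delimiter pattern and return it from a delayed rule. The pattern name d and the input string are illustrative choices:

```wolfram
(* d is bound to whichever delimiter matched and is inserted back into the result *)
StringSplit["a-b:c-d", d : (":" | "-") :> d]
(* {"a", "-", "b", ":", "c", "-", "d"} *)
```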

StringSplit automatically threads over lists of strings:
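
A short sketch with two sample strings, each split independently:

```wolfram
(* The result has one sublist per input string *)
StringSplit[{"a:b:c:d", "listable:element"}, ":"]
(* {{"a", "b", "c", "d"}, {"listable", "element"}} *)
```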

Split a DNA sequence by a particular substring:

Use a wildcard in the pattern to split the biomolecular sequence:

The "N" is a degenerate letter only in biomolecular sequences:

Split only on literal degenerate letters using Verbatim:

Generalizations & Extensions  (1)

All substrings, including zero-length ones at the beginning or end:
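
A sketch comparing the default behavior with All on an illustrative input that starts and ends with a delimiter:

```wolfram
(* Default: empty substrings at the two ends are dropped *)
StringSplit[":a:b:c:", ":"]
(* {"a", "b", "c"} *)

(* All: the empty substrings at the ends are kept *)
StringSplit[":a:b:c:", ":", All]
(* {"", "a", "b", "c", ""} *)
```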

Options  (1)

IgnoreCase  (1)

Split a string at every "c", including uppercase letters:
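
For example, with a sample string chosen here for illustration:

```wolfram
(* Both "c" and "C" act as delimiters; the empty piece after the final "c" is dropped *)
StringSplit["abcABCabc", "c", IgnoreCase -> True]
(* {"ab", "AB", "ab"} *)
```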

Applications  (4)

Make a nested array by applying StringSplit twice:
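
A minimal sketch, assuming rows separated by "//" and entries separated by ":" (both delimiters are illustrative). The outer call relies on StringSplit threading over the list of rows:

```wolfram
(* Inner call splits into rows; outer call splits every row into entries *)
StringSplit[StringSplit["11:12:13//21:22:23//31:32:33", "//"], ":"]
(* {{"11", "12", "13"}, {"21", "22", "23"}, {"31", "32", "33"}} *)
```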

A base random DNA string:

Sequences with adenine symmetrically placed:

Text analysis with some right and left context:

Use StringSplit to find all occurrences of the word "power":

Compute part of the left and right contexts in which each word occurs:

List extensions of files in a directory and its subdirectories:

Properties & Relations  (4)

Splitting at whitespace is equivalent to cases of non-whitespace sequences:
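
The equivalence can be checked on a sample string (chosen here for illustration) by comparing against StringCases:

```wolfram
With[{s = "a bbb  cccc aa   d"},
 (* Maximal runs of non-whitespace characters are the same pieces StringSplit returns *)
 StringSplit[s] === StringCases[s, Except[WhitespaceCharacter] ..]
]
(* True *)
```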

StringSplit with a rule is equivalent to StringReplace:
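
One reading of this relation, sketched on an illustrative input: joining the pieces returned by StringSplit with a rule gives the same string as StringReplace with that rule.

```wolfram
With[{s = "a b::c d::e f g"},
 (* Both sides give "a b--c d--e f g" *)
 StringJoin[StringSplit[s, "::" -> "--"]] === StringReplace[s, "::" -> "--"]
]
(* True *)
```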

A null delimiter splits at every character:
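
For example, using the empty string as the delimiter on a sample input:

```wolfram
(* "" matches between every pair of characters *)
StringSplit["abcdefg", ""]
(* {"a", "b", "c", "d", "e", "f", "g"} *)
```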

Using StringSplit on a comma-separated values string:

In many cases Import and ImportString provide direct functionality:

Possible Issues  (1)

StringSplit by default splits only at whitespace:

This splits into words:

Another way to split into words:
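
The three cases above can be sketched on one illustrative sentence fragment. The use of Except[WordCharacter] and of StringCases here is an assumption about what the missing inputs looked like:

```wolfram
(* Default: punctuation stays attached to the words *)
StringSplit["one, two; three"]
(* {"one,", "two;", "three"} *)

(* Split at runs of non-word characters instead *)
StringSplit["one, two; three", Except[WordCharacter] ..]
(* {"one", "two", "three"} *)

(* Alternatively, collect runs of word characters directly *)
StringCases["one, two; three", WordCharacter ..]
(* {"one", "two", "three"} *)
```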

Wolfram Research (2004), StringSplit, Wolfram Language function, https://reference.wolfram.com/language/ref/StringSplit.html (updated 2020).
