Text Content Types

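Natural language processing functions such as TextCases, TextPosition, and TextContents can recognize many types of content in text. These types include structural and grammatical ones, as well as ones that depend on semantic interpretation.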

Containing specifies a container (such as a sentence) within which to match

Alternatives matches any of several types

Verbatim matches a string literally

StringExpression  ▪  RegularExpression
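As a minimal sketch of how these constructs combine with content type names, the following calls pass different forms to TextCases and TextPosition. The sample sentence is invented for illustration, and the results depend on the installed language models, so no output is shown.

```wolfram
(* hypothetical sample text, not taken from the documentation *)
text = "Paris is the capital of France. It rained on Tuesday, but Berlin stayed dry.";

(* substrings identified as nouns *)
TextCases[text, "Noun"]

(* substrings identified as either nouns or verbs, using Alternatives *)
TextCases[text, "Noun" | "Verb"]

(* sentences that contain a country name, using Containing *)
TextCases[text, Containing["Sentence", "Country"]]

(* the literal string "capital", using Verbatim *)
TextCases[text, Verbatim["capital"]]

(* character positions of substrings identified as cities *)
TextPosition[text, "City"]
```

Any of the type names listed on this page can be substituted for "Noun", "Country", or "City" in the calls above.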

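Structural Elements

"Word" word-like substring (typically delimited by whitespace or punctuation)

"Sentence" sentence-like substring (typically delimited by punctuation characters)

"Paragraph" paragraph-like substring (typically delimited by multiple newlines)

"Quotation" quotation delimited by quotation marks

"Line" substring delimited by newlines

"NonText" characters that are not ordinary textual characters

"Punctuation" punctuation characters

"Whitespace" run of whitespace characters

"Emoticon" emoticon (such as a smiley face)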

Parts of Speech

"Noun"  ▪  "Verb"  ▪  "Adjective"  ▪  "Adverb"  ▪  "Pronoun"  ▪  "Preposition"  ▪  "Conjunction"  ▪  "Determiner"  ▪  "Interjection"

"ProperNoun" proper noun, typically capitalized

"WhPronoun"  ▪  "WhAdverb"  ▪  "WhDeterminer"

"Punctuation"  ▪  "PossessiveModifier"  ▪  "ListItemMarker"  ▪  "Symbol"  ▪  "ForeignWord"

Phrase Types

"NounPhrase"  ▪  "VerbPhrase"  ▪  "AdjectivePhrase"  ▪  "AdverbPhrase"  ▪  "PrepositionalPhrase"  ▪  "ConjunctionPhrase"

"WhNounPhrase"  ▪  "WhAdjectivePhrase"  ▪  "WhAdverbPhrase"  ▪  "WhPrepositionalPhrase"

"NounPhraseHead"  ▪  "QuantifierPhrase"  ▪  "UnlikeCoordinatedPhrase"

"Clause"  ▪  "ReducedRelativeClause"

"Sentence"  ▪  "Fragment"  ▪  "Parenthetical"  ▪  "ListMarker"

Quantitative Elements

"Number" number (such as "67", "6.78", "6.78e+10", or "two thousand")

"Quantity" quantity with units (such as "4.5 km", "10 ft. 6 in.", "30C", "7 m/s", or "three kilometers")

"Unit" unit (such as "km", "ft.", "m/s", or "kilometers")

"CurrencyAmount" amount of currency (such as "$5", "45 pesos", "10.25 GBP", or "seven euros")

"Color" color described in text (such as "light blue")

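Time and Place Elements

"Date" date or date element (day, month, year, century, etc.)

"Location" named geographic location (such as "New York" or "France")

"LocationEntity" named geographic location, together with its interpretation as an entity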
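The quantitative and time-and-place types can be requested in the same way as the structural ones. The sketch below assumes an invented sample sentence and uses the `form -> "Interpretation"` property form of TextCases to ask for interpreted values rather than raw substrings. Results depend on the installed language models, so no output is shown.

```wolfram
(* hypothetical sample text, not taken from the documentation *)
text = "The rover drove 4.5 km across the crater on March 3, 2015, after launching from France.";

(* substrings recognized as quantities with units *)
TextCases[text, "Quantity"]

(* the same matches, returned as interpreted expressions instead of strings *)
TextCases[text, "Quantity" -> "Interpretation"]

(* several types at once; the result is an association keyed by type *)
TextCases[text, {"Date", "Location"}]
```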

Identifier Elements

"EmailAddress"  ▪  "IPAddress"  ▪  "PhoneNumber"  ▪  "URL"  ▪  "ZIPCode"

"TwitterHandle" Twitter handle (such as "@Wolfram")

Entities

Entity matches a specific entity of any type, for example:

Geographic Entities

"Country"  ▪  "AdministrativeDivision"  ▪  "City"  ▪  "Neighborhood"  ▪  "MetropolitanArea"  ▪  "GeographicRegion"

"Ocean"  ▪  "Island"  ▪  "UnderseaFeature"  ▪  "Reef"  ▪  "Beach"  ▪  "Lake"  ▪  "Mountain"  ▪  "Volcano"  ▪  "River"  ▪  "Waterfall"  ▪  "EarthImpact"  ▪  "Desert"  ▪  "Forest"

"Airport"  ▪  "Park"  ▪  "AmusementPark"  ▪  "AmusementParkRide"  ▪  "Stadium"

"Bridge"  ▪  "Canal"  ▪  "Tunnel"  ▪  "Dam"  ▪  "Mine"  ▪  "Cave"  ▪  "OilField"  ▪  "Building"  ▪  "Castle"  ▪  "Cemetery"  ▪  "HistoricalSite"  ▪  "ReserveLand"  ▪  "Shipwreck"

"University"  ▪  "SchoolDistrict"  ▪  "PublicSchool"  ▪  "PrivateSchool"  ▪  "Museum"  ▪  "LibraryBranch"  ▪  "LibrarySystem"

"WeatherStation"  ▪  "AstronomicalObservatory"  ▪  "ParticleAccelerator"  ▪  "NuclearReactor"  ▪  "NuclearTestSite"  ▪  "NuclearExplosion"

"TimeZone"

Astronomical Entities

"Planet"  ▪  "PlanetaryMoon"  ▪  "MinorPlanet"  ▪  "Comet"  ▪  "SolarSystemFeature"  ▪  "MeteorShower"  ▪  "Exoplanet"

"Star"  ▪  "Galaxy"  ▪  "StarCluster"  ▪  "Nebula"  ▪  "Supernova"  ▪  "Pulsar"  ▪  "AstronomicalRadioSource"  ▪  "Constellation"

Space-Related

"Satellite"  ▪  "Rocket"  ▪  "DeepSpaceProbe"  ▪  "MannedSpaceMission"

Weather and Earth Science

"WeatherStation"  ▪  "TropicalStorm"  ▪  "Cloud"  ▪  "AtmosphericLayer"

"GeologicalLayer"  ▪  "GeologicalPeriod"  ▪  "Mineral"  ▪  "FamousGem"

Transportation-Related

"Aircraft"  ▪  "Airline"  ▪  "Airport"  ▪  "Ship"

Engineering and Structures

"BroadcastStation"  ▪  "MeasurementDevice"

"Building"  ▪  "Bridge"  ▪  "Tunnel"  ▪  "Dam"  ▪  "Mine"

Culture and Entertainment

"Language"  ▪  "Religion"  ▪  "Mythology"

"Movie"  ▪  "MusicAct"  ▪  "MusicAlbum"  ▪  "MusicWork"  ▪  "BroadcastStation"

"Book"  ▪  "Artwork"  ▪  "Periodical"  ▪  "FictionalCharacter"

"Museum"  ▪  "LibraryBranch"  ▪  "LibrarySystem"

Activities and Hobbies

"MusicalInstrument"  ▪  "SportObject"  ▪  "BoardGame"

Food and Nutrition

"Food"  ▪  "FoodBrandName"  ▪  "FoodManufacturer"  ▪  "FoodSubBrandName"

Finance

"Company"  ▪  "Financial"

People and Personal Attributes

"Person"  ▪  "GivenName"  ▪  "Surname"  ▪  "PersonTitle"  ▪  "Occupation"

History-Related

"HistoricalCountry"  ▪  "HistoricalSite"

Linguistic Entities

"Language"  ▪  "Alphabet"  ▪  "WritingScript"

Physics and Chemistry

"Chemical"  ▪  "Element"  ▪  "Particle"  ▪  "Mineral"

"FamousPhysicsProblem"  ▪  "FamousChemistryProblem"

Life Sciences

"Gene"  ▪  "Protein"

Medical Entities

"AnatomicalStructure"  ▪  "Disease"  ▪  "MedicalTest"  ▪  "Protein"

Types of Life Forms

"Plant"  ▪  "Species"  ▪  "DogBreed"  ▪  "CatBreed"  ▪  "Dinosaur"

Mathematical Entities

"Polyhedron"  ▪  "Surface"  ▪  "SpaceCurve"  ▪  "Graph"  ▪  "FiniteGroup"  ▪  "IntegerSequence"

"FamousMathProblem"  ▪  "FamousMathGame"

Computing-Related

"NotableComputer"  ▪  "ProgrammingLanguage"

Language Style and Sentiment

"PositiveSentiment"  ▪  "NegativeSentiment"  ▪  "NeutralSentiment"

"Profanity" text containing profanity

Content Topics

"BooksTopic"  ▪  "CareerAndMoneyTopic"  ▪  "FamilyAndFriendsTopic"  ▪  "FashionTopic"  ▪  "FitnessTopic"  ▪  "FoodAndDrinkTopic"  ▪  "HealthTopic"  ▪  "LeisureTopic"  ▪  "MoviesTopic"  ▪  "MusicTopic"  ▪  "PersonalMoodTopic"  ▪  "PetsAndAnimalsTopic"  ▪  "PoliticsTopic"  ▪  "QuotesAndLifePhilosophyTopic"  ▪  "RelationshipsTopic"  ▪  "SchoolAndUniversityTopic"  ▪  "SocialMediaTopic"  ▪  "SpecialOccasionsTopic"  ▪  "SportsTopic"  ▪  "TechnologyTopic"  ▪  "TelevisionTopic"  ▪  "TransportTopic"  ▪  "TravelTopic"  ▪  "VideoGamesTopic"  ▪  "WeatherTopic"

Human Languages

"Afrikaans"  ▪  "Albanian"  ▪  "Amharic"  ▪  "Arabic"  ▪  "Armenian"  ▪  "Azerbaijani"  ▪  "Basque"  ▪  "Bengali"  ▪  "Bosnian"  ▪  "Bulgarian"  ▪  "Catalan"  ▪  "Chinese"  ▪  "Croatian"  ▪  "Czech"  ▪  "Danish"  ▪  "Dutch"  ▪  "English"  ▪  "Esperanto"  ▪  "Estonian"  ▪  "Finnish"  ▪  "French"  ▪  "Georgian"  ▪  "German"  ▪  "Greek"  ▪  "Gujarati"  ▪  "Hebrew"  ▪  "Hindi"  ▪  "Hungarian"  ▪  "Icelandic"  ▪  "InuktitutGreenlandic"  ▪  "Italian"  ▪  "Japanese"  ▪  "Kannada"  ▪  "Kazakh"  ▪  "Khmer"  ▪  "Korean"  ▪  "Latvian"  ▪  "Lithuanian"  ▪  "Macedonian"  ▪  "Majhi"  ▪  "Malay"  ▪  "Malayalam"  ▪  "Mongolian"  ▪  "Nepali"  ▪  "NorwegianBokmal"  ▪  "Persian"  ▪  "Polish"  ▪  "Portuguese"  ▪  "Romanian"  ▪  "Russian"  ▪  "Serbian"  ▪  "Sinhala"  ▪  "Slovak"  ▪  "Slovenian"  ▪  "Spanish"  ▪  "Swahili"  ▪  "Swedish"  ▪  "Tagalog"  ▪  "Tamil"  ▪  "Telugu"  ▪  "Thai"  ▪  "Turkish"  ▪  "Ukrainian"  ▪  "Urdu"  ▪  "UzbekNorthern"  ▪  "Vietnamese"  ▪  "Welsh"

Programming Languages

"ABAP"  ▪  "Ada"  ▪  "AWK"  ▪  "BourneShell"  ▪  "C"  ▪  "CPlusPlus"  ▪  "CSharp"  ▪  "COBOL"  ▪  "CommonLisp"  ▪  "D"  ▪  "Dart"  ▪  "Delphi"  ▪  "Erlang"  ▪  "FSharp"  ▪  "Fortran"  ▪  "Groovy"  ▪  "Haskell"  ▪  "Java"  ▪  "JavaScript"  ▪  "Logo"  ▪  "Lua"  ▪  "MATLAB"  ▪  "ObjectiveC"  ▪  "Perl"  ▪  "PHP"  ▪  "Prolog"  ▪  "Python"  ▪  "R"  ▪  "Ruby"  ▪  "Rust"  ▪  "SAS"  ▪  "Scala"  ▪  "Scheme"  ▪  "SQL"  ▪  "Swift"  ▪  "Tcl"  ▪  "VBScript"  ▪  "VisualBasicNET"  ▪  "WindowsPowerShell"  ▪  "WolframLanguage"