I couldn’t find this anywhere online, so I figured it out myself. You can run the script below to get all of the defined word breakers for a given language, that is, the characters that the full-text search engine treats as word separators.
To figure it out, I loop through the first thousand Unicode code points, build a three-character string by putting 'a' in front of each character and 'b' behind it, and pass that string to the parser with sys.dm_fts_parser. If the parser returns more than one row, the character broke the word; if it returns exactly one, it didn’t.
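To make a single check concrete, here is a minimal sketch that probes one character. The hyphen is just an illustrative choice on my part, and I have left out expected output because it depends on the word breaker installed for the language. As far as I know, sys.dm_fts_parser requires sysadmin membership, so treat that as an assumption to verify on your server.

```sql
-- Probe a single candidate character (hyphen chosen only as an example).
-- Arguments: query string, LCID (1033 = English),
--            stoplist id (0 = system stoplist), accent sensitivity (0 = off).
SELECT occurrence, display_term, special_term
FROM sys.dm_fts_parser(N'a-b', 1033, 0, 0);
```

Counting the rows of this result is exactly what the loop in the full script does for each character.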
You can change the language by replacing the 1033 argument (the language's LCID) with the one you need.
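If you are not sure which LCID to use, the catalog view sys.fulltext_languages lists the languages registered for full-text search on your instance. A small sketch, assuming you only need the LCID and name columns:

```sql
-- List the languages available to full-text search and their LCIDs.
SELECT lcid, name
FROM sys.fulltext_languages
ORDER BY name;
```

Any lcid value from this list can be substituted for 1033 in the script.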
-- Unicode code points handled by NCHAR go from 0 to 65535
-- Most commonly used values are < 1000
-- Get the Unicode value of a character with UNICODE('x')
-- Get the character from an int with NCHAR(x)
DECLARE @results TABLE (Breaker nchar, IntValue int);
DECLARE @i int;
SET @i = 0;

WHILE @i < 1000
BEGIN
    -- Test string: candidate character between two letters
    DECLARE @sql nvarchar(600);
    SET @sql = 'a' + NCHAR(@i) + 'b';

    DECLARE @ret int;
    -- 1033 = English, first zero = system stoplist, second zero = no accent sensitivity
    SELECT @ret = COUNT(*) FROM sys.dm_fts_parser(@sql, 1033, 0, 0);

    -- More than one row means the character split the string into separate words
    IF @ret > 1
    BEGIN
        INSERT INTO @results (Breaker, IntValue) VALUES (NCHAR(@i), @i);
    END

    SET @i = @i + 1;
END

SELECT DISTINCT Breaker FROM @results;