This isn’t going to be a nice handy list, but it will show how to quickly get such a list.
The basic principle is going to rely on sys.dm_fts_parser
. As described on the linked page this takes a string, and then breaks it down into ‘tokens’ based on the defined language, word breaker, thesaurus and stop list. As an example:
select * FROM sys.dm_fts_parser ('"the quick brown fox jumpsover"', 1033, 0, 0)
The typo is deliberate, for once!. And this returns:
So it’s returned a row for every token it can find. This example was nice and simple as it only contains words and spaces. The type was to show that without a word breaker SQL Server will just return the string without interpreting it. So that was for spaces, now just to confirm for other characters. First of all commas:
select * FROM sys.dm_fts_parser ('"quick,brown"', 1033, 0, 0)
So, that returns 2 rows as expected. How about hyphens?:
select * FROM sys.dm_fts_parser ('"quick-brown"', 1033, 0, 0)
Interesting, we get 3 rows this time. This makes sense when you think that hyphenation isn’t an exact science so some people will use them, and some won’t. So by combining both ‘sides’ of the hyphenated word SQL Server can hopefully match both uses.
So, that’s the basic theory. So this little piece of T-SQL is going to loop through all the 255 ASCII characters. For each one we’re going to use it to join 2 ‘words’, and then run that string through sys.dm_fts_parser. If the function returns more than 1 row we now that it’s found a word breaker, so we then output the character, and the character code as not all the characters are printable. You’ll also notice that code 34 throws an error, that’s because it’s ” which is a reserved character within full Text searches.
declare @i integer declare @cnt integer set @i=0 while @i<255 begin set @cnt=0 select @cnt=COUNT(1) FROM sys.dm_fts_parser ('"word1'+CHAR(@i)+'word2"', 1033, 0, 0) if @cnt>1 begin print 'this char - '+char(@i)+' - char('+convert(varchar(3),@i)+') is a word breaker' end set @i=@i+1 end
Which gives a nice long list:
this char – – char(1) is a word breaker this char – – char(2) is a word breaker this char – – char(3) is a word breaker this char – – char(4) is a word breaker this char – – char(5) is a word breaker this char – – char(6) is a word breaker this char – – char(7) is a word breaker this char – – char(8) is a word breaker this char – – char(9) is a word breaker this char – – char(10) is a word breaker this char – – char(11) is a word breaker this char – – char(12) is a word breaker this char – – char(13) is a word breaker this char – – char(14) is a word breaker this char – – char(15) is a word breaker this char – – char(16) is a word breaker this char – – char(17) is a word breaker this char – – char(18) is a word breaker this char – – char(19) is a word breaker this char – – char(20) is a word breaker this char – – char(21) is a word breaker this char – – char(22) is a word breaker this char – – char(23) is a word breaker this char – – char(24) is a word breaker this char – – char(25) is a word breaker this char – – char(26) is a word breaker this char – – char(27) is a word breaker this char – – char(28) is a word breaker this char – – char(29) is a word breaker this char – – char(30) is a word breaker this char – – char(31) is a word breaker this char – – char(32) is a word breaker this char – ! – char(33) is a word breaker Msg 7630, Level 15, State 3, Line 7 Syntax error near ‘word2’ in the full-text search condition ‘"word1"word2"’. this char – # – char(35) is a word breaker this char – $ – char(36) is a word breaker this char – % – char(37) is a word breaker this char – & – char(38) is a word breaker this char – ( – char(40) is a word breaker this char – ) – char(41) is a word breaker this char – * – char(42) is a word breaker this char – + – char(43) is a word breaker this char – , – char(44) is a word breaker this char – – – char(45) is a word breaker this char – . – char(46) is a word breaker this char – / – char(47) is a word breaker this char – : – char(58) is a word breaker this char – ; – char(59) is a word breaker this char – < – char(60) is a word breaker this char – = – char(61) is a word breaker this char – > – char(62) is a word breaker this char – ? – char(63) is a word breaker this char – @ – char(64) is a word breaker this char – [ – char(91) is a word breaker this char – \ – char(92) is a word breaker this char – ] – char(93) is a word breaker this char – ^ – char(94) is a word breaker this char – { – char(123) is a word breaker this char – | – char(124) is a word breaker this char – } – char(125) is a word breaker this char – ~ – char(126) is a word breaker this char – – char(127) is a word breaker this char –

About Stuart Moore
Nottingham based IT professional with over 15 years in the industry. Specialising in SQL Server, Infrastructure design, Disaster Recovery and Service Continuity management. Now happily probing the options the cloud provides
When not in front of a computer, is most likely to be found on the saddle of his bike