12.9.4 Full-Text Stopwords
        The stopword list is loaded and searched for full-text queries
        using the server character set and collation (the values of the
        character_set_server and
        collation_server system
        variables). False hits or misses might occur for stopword
        lookups if the stopword file or columns used for full-text
        indexing or searches have a character set or collation different
        from character_set_server or
        collation_server.
      
        Case sensitivity of stopword lookups depends on the server
        collation. For example, lookups are case-insensitive if the
        collation is utf8mb4_0900_ai_ci, whereas
        lookups are case-sensitive if the collation is
        utf8mb4_0900_as_cs or
        utf8mb4_bin.
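As a quick check, the following sketch shows how to inspect the server character set and collation used for stopword lookups, and how the collation suffix changes a string comparison. The string literals are illustrative assumptions only. The collation names are the ones mentioned above and assume a server that provides the utf8mb4_0900 collations.

```sql
-- Show the server-level settings used to load and search the stopword list.
SELECT @@character_set_server, @@collation_server;

-- Illustration only: the same two strings compared under a
-- case-insensitive and a case-sensitive collation.
SELECT
  _utf8mb4'The' COLLATE utf8mb4_0900_ai_ci = _utf8mb4'the' AS ai_ci_equal,  -- 1
  _utf8mb4'The' COLLATE utf8mb4_0900_as_cs = _utf8mb4'the' AS as_cs_equal;  -- 0
```

Under a case-sensitive server collation, a stopword stored as "the" would therefore not be expected to match "The" in indexed text.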
Stopwords for InnoDB Search Indexes
          InnoDB has a relatively short list of
          default stopwords, because documents from technical, literary,
          and other sources often use short words as keywords or in
          significant phrases. For example, you might search for
          “to be or not to be” and expect to get a sensible
          result, rather than having all those words ignored.
        
          To see the default InnoDB stopword list,
          query the
          INFORMATION_SCHEMA.INNODB_FT_DEFAULT_STOPWORD
          table.
        
mysql> SELECT * FROM INFORMATION_SCHEMA.INNODB_FT_DEFAULT_STOPWORD;
+-------+
| value |
+-------+
| a     |
| about |
| an    |
| are   |
| as    |
| at    |
| be    |
| by    |
| com   |
| de    |
| en    |
| for   |
| from  |
| how   |
| i     |
| in    |
| is    |
| it    |
| la    |
| of    |
| on    |
| or    |
| that  |
| the   |
| this  |
| to    |
| was   |
| what  |
| when  |
| where |
| who   |
| will  |
| with  |
| und   |
| the   |
| www   |
+-------+
          To define your own stopword list for all
          InnoDB tables, define a table with the same
          structure as the
          INNODB_FT_DEFAULT_STOPWORD table,
          populate it with stopwords, and set the value of the
          innodb_ft_server_stopword_table
          option to a value in the form
          db_name/table_name. The stopword table
          must have a single VARCHAR column
          named value. The following example
          demonstrates creating and configuring a new global stopword
          table for InnoDB.
        
-- Create a new stopword table
mysql> CREATE TABLE my_stopwords(value VARCHAR(30)) ENGINE = INNODB;
Query OK, 0 rows affected (0.01 sec)
-- Insert stopwords (for simplicity, a single stopword is used in this example)
mysql> INSERT INTO my_stopwords(value) VALUES ('Ishmael');
Query OK, 1 row affected (0.00 sec)
-- Create the table
mysql> CREATE TABLE opening_lines (
       id INT UNSIGNED AUTO_INCREMENT NOT NULL PRIMARY KEY,
       opening_line TEXT(500),
       author VARCHAR(200),
       title VARCHAR(200)
       ) ENGINE=InnoDB;
Query OK, 0 rows affected (0.01 sec)
-- Insert data into the table
mysql> INSERT INTO opening_lines(opening_line,author,title) VALUES
       ('Call me Ishmael.','Herman Melville','Moby-Dick'),
       ('A screaming comes across the sky.','Thomas Pynchon','Gravity\'s Rainbow'),
       ('I am an invisible man.','Ralph Ellison','Invisible Man'),
       ('Where now? Who now? When now?','Samuel Beckett','The Unnamable'),
       ('It was love at first sight.','Joseph Heller','Catch-22'),
       ('All this happened, more or less.','Kurt Vonnegut','Slaughterhouse-Five'),
       ('Mrs. Dalloway said she would buy the flowers herself.','Virginia Woolf','Mrs. Dalloway'),
       ('It was a pleasure to burn.','Ray Bradbury','Fahrenheit 451');
Query OK, 8 rows affected (0.00 sec)
-- Set the innodb_ft_server_stopword_table option to the new stopword table
mysql> SET GLOBAL innodb_ft_server_stopword_table = 'test/my_stopwords';
Query OK, 0 rows affected (0.00 sec)
-- Create the full-text index (which rebuilds the table if no FTS_DOC_ID column is defined)
mysql> CREATE FULLTEXT INDEX idx ON opening_lines(opening_line);
Query OK, 0 rows affected, 1 warning (1.17 sec)
          Verify that the specified stopword ('Ishmael') does not appear
          by querying the words in
          INFORMATION_SCHEMA.INNODB_FT_INDEX_TABLE.
            Note: By default, words shorter than 3 characters or
            longer than 84 characters do not appear in an
            InnoDB full-text search index. Maximum
            and minimum word length values are configurable using the
            innodb_ft_max_token_size
            and
            innodb_ft_min_token_size
            variables. This default behavior does not apply to the ngram
            parser plugin. ngram token size is defined by the
            ngram_token_size option.
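mysql> SET GLOBAL innodb_ft_aux_table='test/opening_lines';
Query OK, 0 rows affected (0.00 sec)
mysql> SELECT word FROM INFORMATION_SCHEMA.INNODB_FT_INDEX_TABLE LIMIT 15;
+-----------+
| word      |
+-----------+
| across    |
| all       |
| burn      |
| buy       |
| call      |
| comes     |
| dalloway  |
| first     |
| flowers   |
| happened  |
| herself   |
| invisible |
| less      |
| love      |
| man       |
+-----------+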
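The word-length limits mentioned in the note above can be inspected directly. This is a minimal sketch. It assumes the ngram parser plugin is available, as it is in standard builds, so that ngram_token_size can be read.

```sql
-- Current word-length limits for InnoDB full-text indexes
-- (defaults are 3 and 84, matching the note above).
SELECT @@innodb_ft_min_token_size, @@innodb_ft_max_token_size;

-- Token size used by the ngram parser plugin instead of the limits above.
SELECT @@ngram_token_size;
```

To my knowledge these are startup options rather than dynamic variables, so changing them means editing the server configuration, restarting, and rebuilding the affected full-text indexes.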
          To create stopword lists on a table-by-table basis, create
          other stopword tables and use the
          innodb_ft_user_stopword_table
          option to specify the stopword table that you want to use
          before you create the full-text index.
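A per-table stopword list might be set up as in the sketch below. The database name test, the tables my_table_stopwords and articles, the column body, the index name ft_body, and the stopwords themselves are all hypothetical names chosen for illustration. The sketch assumes the test database exists and that innodb_ft_enable_stopword is left at its default of ON.

```sql
-- Hypothetical stopword table; it needs a single VARCHAR column named value.
CREATE TABLE test.my_table_stopwords (value VARCHAR(30)) ENGINE = InnoDB;
INSERT INTO test.my_table_stopwords (value) VALUES ('lorem'), ('ipsum');

-- Hypothetical table that will receive the full-text index.
CREATE TABLE test.articles (
  id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
  body TEXT
) ENGINE = InnoDB;

-- Select the stopword table (db_name/table_name form) BEFORE creating the index.
SET SESSION innodb_ft_user_stopword_table = 'test/my_table_stopwords';

-- The index built now is filtered with the stopword table chosen above.
CREATE FULLTEXT INDEX ft_body ON test.articles (body);
```

The stopword table is read when the index is created, so changing it later has no effect on an existing index until that index is rebuilt.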
Stopwords for MyISAM Search Indexes
          The stopword file is loaded and searched using
          latin1 if
          character_set_server is
          ucs2, utf16,
          utf16le, or utf32.
        
          
          
          To override the default stopword list for MyISAM tables, set
          the ft_stopword_file system
          variable. (See Section 5.1.8, “Server System Variables”.) The
          variable value should be the path name of the file containing
          the stopword list, or the empty string to disable stopword
          filtering. The server looks for the file in the data directory
          unless an absolute path name is given to specify a different
          directory. After changing the value of this variable or the
          contents of the stopword file, restart the server and rebuild
          your FULLTEXT indexes.
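The procedure above could look roughly like the following sketch. The file path and the table name test.myisam_docs are hypothetical. Using REPAIR TABLE ... QUICK to rebuild is one common approach for MyISAM tables, not the only one.

```sql
-- ft_stopword_file is set at server startup, for example in the option file:
--   [mysqld]
--   ft_stopword_file=/path/to/stopwords.txt   (hypothetical path)

-- After the restart, confirm which stopword file the server is using.
SHOW VARIABLES LIKE 'ft_stopword_file';

-- Rebuild the FULLTEXT indexes of each affected MyISAM table
-- (test.myisam_docs is a hypothetical table name).
REPAIR TABLE test.myisam_docs QUICK;
```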
        
          The stopword list is free-form, separating stopwords with any
          nonalphanumeric character such as newline, space, or comma.
          Exceptions are the underscore character (_)
          and a single apostrophe ('), which are
          treated as part of a word. The character set of the stopword
          list is the server's default character set; see
          Section 10.3.2, “Server Character Set and Collation”.
        
          The following list shows the default stopwords for
          MyISAM search indexes. In a MySQL source
          distribution, you can find this list in the
          storage/myisam/ft_static.c file.
        
a's           able          about         above         according
accordingly   across        actually      after         afterwards
again         against       ain't         all           allow
allows        almost        alone         along         already
also          although      always        am            among
amongst       an            and           another       any
anybody       anyhow        anyone        anything      anyway
anyways       anywhere      apart         appear        appreciate
appropriate   are           aren't        around        as
aside         ask           asking        associated    at
available     away          awfully       be            became
because       become        becomes       becoming      been
before        beforehand    behind        being         believe
below         beside        besides       best          better
between       beyond        both          brief         but
by            c'mon         c's           came          can
can't         cannot        cant          cause         causes
certain       certainly     changes       clearly       co
com           come          comes         concerning    consequently
consider      considering   contain       containing    contains
corresponding could         couldn't      course        currently
definitely    described     despite       did           didn't
different     do            does          doesn't       doing
don't         done          down          downwards     during
each          edu           eg            eight         either
else          elsewhere     enough        entirely      especially
et            etc           even          ever          every
everybody     everyone      everything    everywhere    ex
exactly       example       except        far           few
fifth         first         five          followed      following
follows       for           former        formerly      forth
four          from          further       furthermore   get
gets          getting       given         gives         go
goes          going         gone          got           gotten
greetings     had           hadn't        happens       hardly
has           hasn't        have          haven't       having
he            he's          hello         help          hence
her           here          here's        hereafter     hereby
herein        hereupon      hers          herself       hi
him           himself       his           hither        hopefully
how           howbeit       however       i'd           i'll
i'm           i've          ie            if            ignored
immediate     in            inasmuch      inc           indeed
indicate      indicated     indicates     inner         insofar
instead       into          inward        is            isn't
it            it'd          it'll         it's          its
itself        just          keep          keeps         kept
know          known         knows         last          lately
later         latter        latterly      least         less
lest          let           let's         like          liked
likely        little        look          looking       looks
ltd           mainly        many          may           maybe
me            mean          meanwhile     merely        might
more          moreover      most          mostly        much
must          my            myself        name          namely
nd            near          nearly        necessary     need
needs         neither       never         nevertheless  new
next          nine          no            nobody        non
none          noone         nor           normally      not
nothing       novel         now           nowhere       obviously
of            off           often         oh            ok
okay          old           on            once          one
ones          only          onto          or            other
others        otherwise     ought         our           ours
ourselves     out           outside       over          overall
own           particular    particularly  per           perhaps
placed        please        plus          possible      presumably
probably      provides      que           quite         qv
rather        rd            re            really        reasonably
regarding     regardless    regards       relatively    respectively
right         said          same          saw           say
saying        says          second        secondly      see
seeing        seem          seemed        seeming       seems
seen          self          selves        sensible      sent
serious       seriously     seven         several       shall
she           should        shouldn't     since         six
so            some          somebody      somehow       someone
something     sometime      sometimes     somewhat      somewhere
soon          sorry         specified     specify       specifying
still         sub           such          sup           sure
t's           take          taken         tell          tends      
th            than          thank         thanks        thanx
that          that's        thats         the           their
theirs        them          themselves    then          thence
there         there's       thereafter    thereby       therefore
therein       theres        thereupon     these         they
they'd        they'll       they're       they've       think
third         this          thorough      thoroughly    those
though        three         through       throughout    thru
thus          to            together      too           took
toward        towards       tried         tries         truly
try           trying        twice         two           un
under         unfortunately unless        unlikely      until
unto          up            upon          us            use
used          useful        uses          using         usually
value         various       very          via           viz
vs            want          wants         was           wasn't
way           we            we'd          we'll         we're
we've         welcome       well          went          were
weren't       what          what's        whatever      when
whence        whenever      where         where's       whereafter
whereas       whereby       wherein       whereupon     wherever
whether       which         while         whither       who
who's         whoever       whole         whom          whose
why           will          willing       wish          with
within        without       won't         wonder        would
wouldn't      yes           yet           you           you'd
you'll        you're        you've        your          yours
yourself      yourselves    zero

Document created on 26/06/2006, last modified on 26/10/2018
Source of the printed document: https://www.gaudry.be/en/mysql-rf-fulltext-stopwords.html