
Troubleshooting "Illegal mix of collations" error in MySQL

Category: Q&A | Published: 2010-06-13 00:23:43 | Comments: 13 | Views: 200156 | Likes: 161

I am getting the error below when trying to run a SELECT through a stored procedure in MySQL.

Illegal mix of collations (latin1_general_cs,IMPLICIT) and (latin1_general_ci,IMPLICIT) for operation '='

Any idea on what might be going wrong here?

The collation of the table is latin1_general_ci and that of the column in the WHERE clause is latin1_general_cs.

Answers (13)

  • Answer 1
  • This is generally caused by comparing two strings of incompatible collation or by attempting to select data of different collation into a combined column.

    The clause COLLATE allows you to specify the collation used in the query.

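    For example, on a latin1 connection the following WHERE clause will always give an error of the kind you posted (with EXPLICIT in place of IMPLICIT, because both collations are named in the query):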

    WHERE 'A' COLLATE latin1_general_ci = 'A' COLLATE latin1_general_cs
    

    Your solution is to specify a shared collation for the two columns within the query. Here is an example that uses the COLLATE clause:

    SELECT * FROM `table` ORDER BY `key` COLLATE latin1_general_ci;
    

    Another option is to use the BINARY operator:

    BINARY str is the shorthand for CAST(str AS BINARY).

    Your solution might look something like this:

    SELECT * FROM `table` WHERE BINARY a = BINARY b;
    

    or,

    SELECT * FROM `table` ORDER BY BINARY a;
    
  • Answer 2
  • TL;DR

    Either change the collation of one (or both) of the strings so that they match, or else add a COLLATE clause to your expression.


    1. What is this "collation" stuff anyway?

      As documented under Character Sets and Collations in General:

      A character set is a set of symbols and encodings. A collation is a set of rules for comparing characters in a character set. Let's make the distinction clear with an example of an imaginary character set.

      Suppose that we have an alphabet with four letters: “A”, “B”, “a”, “b”. We give each letter a number: “A” = 0, “B” = 1, “a” = 2, “b” = 3. The letter “A” is a symbol, the number 0 is the encoding for “A”, and the combination of all four letters and their encodings is a character set.

      Suppose that we want to compare two string values, “A” and “B”. The simplest way to do this is to look at the encodings: 0 for “A” and 1 for “B”. Because 0 is less than 1, we say “A” is less than “B”. What we've just done is apply a collation to our character set. The collation is a set of rules (only one rule in this case): “compare the encodings.” We call this simplest of all possible collations a binary collation.

      But what if we want to say that the lowercase and uppercase letters are equivalent? Then we would have at least two rules: (1) treat the lowercase letters “a” and “b” as equivalent to “A” and “B”; (2) then compare the encodings. We call this a case-insensitive collation. It is a little more complex than a binary collation.

      In real life, most character sets have many characters: not just “A” and “B” but whole alphabets, sometimes multiple alphabets or eastern writing systems with thousands of characters, along with many special symbols and punctuation marks. Also in real life, most collations have many rules, not just for whether to distinguish lettercase, but also for whether to distinguish accents (an “accent” is a mark attached to a character as in German “Ö”), and for multiple-character mappings (such as the rule that “Ö” = “OE” in one of the two German collations).

      Further examples are given under Examples of the Effect of Collation.
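
As a minimal sketch of the difference, the query below compares the same letters under three collations of the latin1 character set. The _latin1 introducers and explicit COLLATE clauses are there only so that the result does not depend on the connection's character set; the column aliases are made up for illustration.

```sql
SELECT
  _latin1'a' COLLATE latin1_general_ci = _latin1'A' AS equal_ci,    -- 1: lettercase is ignored
  _latin1'a' COLLATE latin1_general_cs = _latin1'A' AS equal_cs,    -- 0: lettercase matters
  _latin1'a' COLLATE latin1_bin        = _latin1'A' AS equal_bin,   -- 0: the encodings differ
  _latin1'B' COLLATE latin1_bin        < _latin1'a' AS b_first_bin, -- 1: 0x42 is less than 0x61
  _latin1'B' COLLATE latin1_general_ci < _latin1'a' AS b_first_ci;  -- 0: "B" sorts after "a"
```

The last two columns show that a collation affects ordering as well as equality, which is why ORDER BY, DISTINCT and comparison operators all need an unambiguous collation.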

    2. Okay, but how does MySQL decide which collation to use for a given expression?

      As documented under Collation of Expressions:

      In the great majority of statements, it is obvious what collation MySQL uses to resolve a comparison operation. For example, in the following cases, it should be clear that the collation is the collation of column x:

      SELECT x FROM T ORDER BY x;
      SELECT x FROM T WHERE x = x;
      SELECT DISTINCT x FROM T;
      

      However, with multiple operands, there can be ambiguity. For example:

      SELECT x FROM T WHERE x = 'Y';
      

      Should the comparison use the collation of the column x, or of the string literal 'Y'? Both x and 'Y' have collations, so which collation takes precedence?

      Standard SQL resolves such questions using what used to be called “coercibility” rules.

      [ deletia ]

      MySQL uses coercibility values with the following rules to resolve ambiguities:

      • Use the collation with the lowest coercibility value.

      • If both sides have the same coercibility, then:

        • If both sides are Unicode, or both sides are not Unicode, it is an error.

        • If one of the sides has a Unicode character set, and another side has a non-Unicode character set, the side with Unicode character set wins, and automatic character set conversion is applied to the non-Unicode side. For example, the following statement does not return an error:

          SELECT CONCAT(utf8_column, latin1_column) FROM t1;
          

          It returns a result that has a character set of utf8 and the same collation as utf8_column. Values of latin1_column are automatically converted to utf8 before concatenating.

        • For an operation with operands from the same character set but that mix a _bin collation and a _ci or _cs collation, the _bin collation is used. This is similar to how operations that mix nonbinary and binary strings evaluate the operands as binary strings, except that it is for collations rather than data types.

    3. So what is an "illegal mix of collations"?

      An "illegal mix of collations" occurs when an expression compares two strings of different collations but of equal coercibility and the coercibility rules cannot help to resolve the conflict. It is the situation described under the third bullet-point in the above quotation (both sides are Unicode, or both sides are not Unicode).

      The particular error given in the question, Illegal mix of collations (latin1_general_cs,IMPLICIT) and (latin1_general_ci,IMPLICIT) for operation '=', tells us that there was an equality comparison between two non-Unicode strings of equal coercibility. It furthermore tells us that the collations were not given explicitly in the statement but rather were implied from the strings' sources (such as column metadata).

    4. That's all very well, but how does one resolve such errors?

      As the manual extracts quoted above suggest, this problem can be resolved in a number of ways, of which two are sensible and to be recommended:

      • Change the collation of one (or both) of the strings so that they match and there is no longer any ambiguity.

        How this can be done depends on where the string comes from: literal expressions take the collation specified in the collation_connection system variable; values from tables take the collation specified in their column metadata.

      • Force one string to not be coercible.

        I omitted the following quote from the above:

        MySQL assigns coercibility values as follows:

        • An explicit COLLATE clause has a coercibility of 0. (Not coercible at all.)

        • The concatenation of two strings with different collations has a coercibility of 1.

        • The collation of a column or a stored routine parameter or local variable has a coercibility of 2.

        • A “system constant” (the string returned by functions such as USER() or VERSION()) has a coercibility of 3.

        • The collation of a literal has a coercibility of 4.

        • NULL or an expression that is derived from NULL has a coercibility of 5.

        Thus simply adding a COLLATE clause to one of the strings used in the comparison will force use of that collation.

      Whilst the others would be terribly bad practice if they were deployed merely to resolve this error:

      • Force one (or both) of the strings to have some other coercibility value so that one takes precedence.

        Use of CONCAT() or CONCAT_WS() would result in a string with a coercibility of 1; and (if in a stored routine) use of parameters/local variables would result in strings with a coercibility of 2.

      • Change the encodings of one (or both) of the strings so that one is Unicode and the other is not.

        This could be done via transcoding with CONVERT(expr USING transcoding_name); or via changing the underlying character set of the data (e.g. modifying the column, changing character_set_connection for literal values, or sending them from the client in a different encoding and changing character_set_client / adding a character set introducer). Note that changing encoding will lead to other problems if some desired characters cannot be encoded in the new character set.

      • Change the encodings of one (or both) of the strings so that they are both the same and change one string to use the relevant _bin collation.

        Methods for changing encodings and collations have been detailed above. This approach would be of little use if one actually needs to apply more advanced collation rules than are offered by the _bin collation.
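
The two recommended fixes can be sketched as follows. The table collation_demo and its columns are hypothetical, chosen to mirror the collations in the question; the failing statement is left commented out so that the script runs to the end.

```sql
CREATE TABLE collation_demo (
  name_cs VARCHAR(20) CHARACTER SET latin1 COLLATE latin1_general_cs,
  name_ci VARCHAR(20) CHARACTER SET latin1 COLLATE latin1_general_ci
);
INSERT INTO collation_demo (name_cs, name_ci) VALUES ('Alice', 'alice');

-- Both operands are columns: different collations, both IMPLICIT, equal
-- coercibility. This is the statement that fails with error 1267:
-- SELECT * FROM collation_demo WHERE name_cs = name_ci;

-- Fix 1: an explicit COLLATE on one operand (coercibility 0) decides the
-- collation for the whole comparison. The row is returned because the
-- comparison is now case-insensitive.
SELECT * FROM collation_demo
WHERE name_cs = name_ci COLLATE latin1_general_ci;

-- Fix 2: change the column metadata so both sides already agree.
ALTER TABLE collation_demo
  MODIFY name_cs VARCHAR(20) CHARACTER SET latin1 COLLATE latin1_general_ci;
SELECT * FROM collation_demo WHERE name_cs = name_ci;
```

Fix 1 leaves the schema alone but has to be repeated in every affected query; fix 2 is permanent but changes how name_cs compares and sorts everywhere else it is used.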

  • Answer 3
  • Adding my 2c to the discussion for future googlers.

    I was investigating a similar issue where I got the following error when using custom functions that received a varchar parameter:

    Illegal mix of collations (utf8_unicode_ci,IMPLICIT) and 
    (utf8_general_ci,IMPLICIT) for operation '='
    

    Using the following query:

    mysql> show variables like "collation_database";
        +--------------------+-----------------+
        | Variable_name      | Value           |
        +--------------------+-----------------+
        | collation_database | utf8_general_ci |
        +--------------------+-----------------+
    

    I was able to tell that the DB was using utf8_general_ci, while the tables were defined using utf8_unicode_ci:

    mysql> show table status;
        +--------------+-----------------+
        | Name         | Collation       |
        +--------------+-----------------+
        | my_view      | NULL            |
        | my_table     | utf8_unicode_ci |
        ...
    

    Notice that the view has a NULL collation. It appears that views and functions do have collation definitions, even though this query shows NULL for the view. The collation used is the DB collation that was in effect when the view/function was created.

    The sad solution was to both change the DB collation and recreate the views/functions to force them to use the current collation.

    • Changing the db's collation:

      ALTER DATABASE mydb DEFAULT COLLATE utf8_unicode_ci;
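
Recreating a function might look like the sketch below. The table routine_demo and the function count_by_name are hypothetical stand-ins. Declaring the parameter's character set and collation explicitly is one way to stop the routine from silently inheriting whatever database default was in effect when it was created; this assumes a MySQL version that accepts COLLATE in parameter declarations.

```sql
CREATE TABLE IF NOT EXISTS routine_demo (
  name VARCHAR(40) CHARACTER SET utf8 COLLATE utf8_unicode_ci
);

DROP FUNCTION IF EXISTS count_by_name;

-- The parameter carries the same collation as the column it is compared with,
-- so the comparison inside the function has nothing to resolve.
CREATE FUNCTION count_by_name(
  p_name VARCHAR(40) CHARACTER SET utf8 COLLATE utf8_unicode_ci
)
RETURNS INT
READS SQL DATA
RETURN (SELECT COUNT(*) FROM routine_demo WHERE name = p_name);

SELECT count_by_name('alice');

-- Shows the character_set_client, collation_connection and database
-- collation that were recorded when the function was created.
SHOW CREATE FUNCTION count_by_name;
```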
      

    I hope this will help someone.

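  • Answer 4
  • Converting character sets can sometimes be dangerous, especially on databases that hold a lot of data. I think the best option is to use the "binary" operator:

    For example: WHERE binary table1.column1 = binary table2.column1
  • Answer 5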
  • You can try this script, which converts all of your databases and tables to utf8.
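
A rough sketch of the same idea (not the script this answer refers to) is to let information_schema generate one ALTER TABLE statement per table. The schema name mydb and the target collation utf8_unicode_ci are assumptions borrowed from the other answers; review the generated statements and take a backup before running any of them, since each one rewrites the table's data.

```sql
-- Produces statements to run; it does not change anything by itself.
SELECT CONCAT(
         'ALTER TABLE `', TABLE_SCHEMA, '`.`', TABLE_NAME,
         '` CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;'
       ) AS stmt
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = 'mydb'
  AND TABLE_TYPE = 'BASE TABLE'               -- skip views
  AND TABLE_COLLATION <> 'utf8_unicode_ci';   -- skip tables already converted
```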

  • Answer 6
  • I had a similar problem: I was trying to use the FIND_IN_SET function with a string variable.

    SET @my_var = 'string1,string2';
    SELECT * from my_table WHERE FIND_IN_SET(column_name,@my_var);
    

    and was receiving the error

    Error Code: 1267. Illegal mix of collations (utf8_unicode_ci,IMPLICIT) and (utf8_general_ci,IMPLICIT) for operation 'find_in_set'

    Short answer:

    No need to change any collation_YYYY variables, just add the correct collation next to your variable declaration, i.e.

    SET @my_var = 'string1,string2' COLLATE utf8_unicode_ci;
    SELECT * from my_table WHERE FIND_IN_SET(column_name,@my_var);
    

    Long answer:

    I first checked the collation variables:

    mysql> SHOW VARIABLES LIKE 'collation%';
        +----------------------+-----------------+
        | Variable_name        | Value           |
        +----------------------+-----------------+
        | collation_connection | utf8_general_ci |
        | collation_database   | utf8_general_ci |
        | collation_server     | utf8_general_ci |
        +----------------------+-----------------+
    

    Then I checked the table collation:

    mysql> SHOW CREATE TABLE my_table;
    
    CREATE TABLE `my_table` (
      `id` int(11) NOT NULL AUTO_INCREMENT,
      `column_name` varchar(40) COLLATE utf8_unicode_ci DEFAULT NULL,
      PRIMARY KEY (`id`)
    ) ENGINE=MyISAM AUTO_INCREMENT=125 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
    

    This means that my variable was configured with the default collation of utf8_general_ci while my table was configured as utf8_unicode_ci.

    By adding the COLLATE clause next to the variable declaration, the variable collation matched the collation configured for the table.
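
To confirm what a user variable actually carries, COLLATION() can be applied to it. The sketch below also shows an alternative placement of the COLLATE clause, at the point of use rather than at assignment. It assumes the same my_table as above and a utf8 connection, as in this answer.

```sql
SET @my_var = 'string1,string2';

-- The variable takes the collation of the literal assigned to it,
-- which follows collation_connection.
SELECT COLLATION(@my_var) AS var_collation;

-- Alternative: leave the assignment alone and add COLLATE inside the query.
SELECT * FROM my_table
WHERE FIND_IN_SET(column_name, @my_var COLLATE utf8_unicode_ci);
```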

  • Answer 7
  • MySQL really dislikes mixing collations unless it can coerce them to the same one (which clearly is not feasible in your case). Can't you just force the same collation to be used via a COLLATE clause? (or the simpler BINARY shortcut if applicable...).

  • Answer 8
  • Solution if literals are involved.

    I am using Pentaho Data Integration and don't get to specify the SQL syntax. Using a very simple DB lookup gave the error "Illegal mix of collations (cp850_general_ci,COERCIBLE) and (latin1_swedish_ci,COERCIBLE) for operation '='"

    The generated code was "SELECT DATA_DATE AS latest_DATA_DATE FROM hr_cc_normalised_data_date_v WHERE PSEUDO_KEY = ?"

    To cut a long story short, the lookup was against a view, so I checked the view's columns:

    mysql> show full columns from hr_cc_normalised_data_date_v;
    +------------+------------+-------------------+------+-----+
    | Field      | Type       | Collation         | Null | Key |
    +------------+------------+-------------------+------+-----+
    | PSEUDO_KEY | varchar(1) | cp850_general_ci  | NO   |     |
    | DATA_DATE  | varchar(8) | latin1_general_cs | YES  |     |
    +------------+------------+-------------------+------+-----+
    

    This explains where the 'cp850_general_ci' comes from.

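    The view was simply created with 'SELECT 'X',......'. According to the manual, literals like this should inherit their character set and collation from the server settings, which were correctly defined as 'latin1' and 'latin1_general_cs'. As this clearly did not happen, I forced it in the creation of the view. (As the second answer above notes, literals take the collation of collation_connection, so the session that created the view was presumably connected as cp850.)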

    CREATE OR REPLACE VIEW hr_cc_normalised_data_date_v AS
    SELECT convert('X' using latin1) COLLATE latin1_general_cs        AS PSEUDO_KEY
        ,  DATA_DATE
    FROM HR_COSTCENTRE_NORMALISED_mV
    LIMIT 1;
    

    Now it shows latin1_general_cs for both columns and the error has gone away. :)

  • Answer 9
  • If the columns that you are having trouble with are "hashes", then consider the following...

    If the "hash" is a binary string, you should really use a BINARY(...) datatype.

    If the "hash" is a hex string, you do not need utf8, and should avoid it because of character checks, etc. For example, MySQL's MD5(...) yields a fixed-length 32-byte hex string. SHA1(...) gives a 40-byte hex string. This could be stored into CHAR(32) CHARACTER SET ascii (or 40 for sha1).

    Or, better yet, store UNHEX(MD5(...)) into BINARY(16). This cuts in half the size of the column. (It does, however, make it rather unprintable.) SELECT HEX(hash) ... if you want it readable.

    Comparing two BINARY columns has no collation issues.
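
A minimal sketch of both storage options, using a hypothetical table hash_demo and the sample input 'hello':

```sql
CREATE TABLE hash_demo (
  id      INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  md5_hex CHAR(32) CHARACTER SET ascii NOT NULL,  -- readable, 32 bytes
  md5_bin BINARY(16) NOT NULL                     -- compact, 16 bytes
);

INSERT INTO hash_demo (md5_hex, md5_bin)
VALUES (MD5('hello'), UNHEX(MD5('hello')));

-- BINARY values are compared byte by byte, so no collation is involved.
-- HEX() turns the stored bytes back into a printable (uppercase) string.
SELECT id, HEX(md5_bin) AS md5_readable
FROM hash_demo
WHERE md5_bin = UNHEX(MD5('hello'));
```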

  • Answer 11
  • Another source of collation issues is the mysql.proc table. Check the collations of your stored procedures and functions:

    SELECT
      p.db, p.db_collation, p.type, COUNT(*) cnt
    FROM mysql.proc p
    GROUP BY p.db, p.db_collation, p.type;
    

    Also pay attention to mysql.proc.collation_connection and mysql.proc.character_set_client columns.
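
The mysql.proc table no longer exists in MySQL 8.0. A sketch of the equivalent check against information_schema.ROUTINES, which exposes the same three settings, could look like this:

```sql
SELECT
  ROUTINE_SCHEMA,
  ROUTINE_TYPE,
  DATABASE_COLLATION,      -- database collation when the routine was created
  COLLATION_CONNECTION,    -- session collation_connection at creation time
  CHARACTER_SET_CLIENT,    -- session character_set_client at creation time
  COUNT(*) AS cnt
FROM information_schema.ROUTINES
GROUP BY ROUTINE_SCHEMA, ROUTINE_TYPE, DATABASE_COLLATION,
         COLLATION_CONNECTION, CHARACTER_SET_CLIENT;
```

More than one row for the same schema suggests that its routines were created under different settings and may need to be recreated.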

  • Answer 12
  • I used ALTER DATABASE mydb DEFAULT COLLATE utf8_unicode_ci; but it didn't work for this query:

    Select * from table1, table2 where table1.field = date_format(table2.field,'%H');
    

    This worked for me:

    Select * from table1, table2 where concat(table1.field) = date_format(table2.field,'%H');
    

    Yes, only a concat.

  • Answer 13
  • Run this code in the "Run SQL query/queries on database" box (the SQL query window):

    ALTER TABLE `table_name` CHANGE `column_name` `column_name` VARCHAR(128) CHARACTER SET utf8 COLLATE utf8_unicode_ci NULL DEFAULT NULL;
    

    Please replace table_name and column_name with the appropriate names.
