MYSQL TO POSTGRESQL


Converting MySQL to PostgreSQL



Very Short Intro[edit]

You may have read a bunch of short articles with the same name on the web, but they were just snippets of information you needed. It's time to put it all together.

You have a project in MySQL and suddenly you find out that you need to switch to PostgreSQL. Suddenly you see that there are many flavours of SQL and that your seemingly basic constructions throw a lot of errors. You don't have time to really rewrite your code from scratch, it may come later...

Actually, there may be some good reasons to switch...

With PostgreSQL you may still feel a little like a second-class citizen, but not really the ignored one. There are some major projects like AsteriskHorde or DBMail that have recognized its qualities and although MySQL was their first choice database, they are showing effort to make things run here too.

Check The Server Running[edit]

Most likely you don't need this chapter, but very briefly: after you've installed your package with PostgreSQL on your Linux machine (be it from a package or following these notes), you need to do something like

su -
su - postgres
createdb test
psql test
=# create user username password ' password ';
-- To change a password:
=# alter role username password ' password ';
=# create database databasename with encoding 'utf8';
=# grant all privileges on database databasename to username;
=# \l
=# \c databasename =# \q

vi /etc/postgresql/pg_hba.conf

host    all         all         0.0.0.0           0.0.0.0            password

be SURE to cover this security issue with iptables!

/etc/init.d/postgresql reload or /usr/lib/postgresql/bin/pg_ctl reload

postmaster successfully signaled

psql -h server -d databasename -U username

databasename=>

Convert and Import[edit]

Common way with SQL dump[edit]

Dump your tables with

mysqldump -u username -p --compatible=postgresql databasename > outputfile.sql

but even then you will have to change escaped chars (replacing \t with ^I, \n with ^M, single quote (') with doubled single quote and double (escaped) backslash (\\) with a single backslash). This can't be trivially done with sedcommand, you may need to write a script for it (RubyPerl, etc). It is much better and proven solution to prepend your dump with the following lines

SET standard_conforming_strings = 'off';
SET backslash_quote = 'on';

These options will force PostgreSQL parser to accept non-ANSI-SQL-compatible escape sequences (Postgre will still issue HINTs on it; you can safely ignore them). Do not set these options globally: this may compromise security of the server!

You also have to manually modify the data types etc. as discussed later.

After you convert your tables, import them the same way you were used to in MySQL, that is

psql -h server -d databasename -U username -f data.sql

Export using CSV-files[edit]

When you have a large sql dump and a binary data inside, it will be uneasy to modify the data structure, so there is another way to export your data to PostgreSQL. Mysql have an option to export each tables from database as separate .sql file with table structure and .txt file with table's data in CSV-format:

mysqldump -U username -p -T/path/to/export databasename

Notice that /path/to/export should be writeable by user who runs mysqld, in most case it mysqld. After that you should modify your table structure according PostgreSQL format:

  • convert data types
  • create separate keys definitions
  • replace escape characters

When table structure will be ready, you should load it as it was shown above. You should prepare data files: replace carriage return characters to "\r" and remove invalid characters for your data encoding. Here is an example bash script how you can do this and load all the data in your database:

#!/bin/bash

CHARSET="utf-8" #your current database charset
DATADIR="/path/to/export"
DBNAME="databasename"

for file in `ls -1 $DATADIR/*.txt`; do
  TMP=${file%.*}
  TABLE=${TMP##*/}
  echo "preparing $TABLE"

  #replace carriage return
  sed 's/\r/\\r/g' $file > /tmp/$TABLE.export.tmp

  #cleanup non-printable and wrong sequences for current charset
  iconv -t $CHARSET -f $CHARSET -c < /tmp/$TABLE.export.tmp > /tmp/$TABLE.export.tmp.out

  echo "loading $TABLE"
  /usr/bin/psql $DBNAME -c "copy $TABLE from '/tmp/$TABLE.export.tmp.out'"

  #clean up
  rm /tmp/$TABLE.export.tmp /tmp/$TABLE.export.tmp.out
done

The Environment[edit]

Perl[edit]

You will need to install an appropriate DBD package. In Debian/Ubuntu run apt-get install libdbd-pg-perl.

Changing The Code Quick And Dirty[edit]

Perl[edit]

MySQL PostgreSQL comments
$db=DBI->connect("dbi:mysql:database= ... )
$db=DBI->connect("dbi:Pg:database= ... )
All you have to do is changing mysql to Pg. Beware the case sensitivity.

SQL[edit]

Syntax[edit]

MySQL PostgreSQL comments
#
--
MySQL accepts nonstandard # to begin a comment line; PostgreSQL uses ANSI standard double dash; use the ANSI standard, both databases understand it. (However, MySQL requires a space after --, whilst it is not mandatory in PostgreSQL)
' " vs. `
' vs. "
MySQL uses ' or " to quote values (i.e. WHERE name = "John"). This is not the ANSI standard for databases. PostgreSQL uses only single quotes for this (i.e.WHERE name = 'John'). Double quotes are used to quote system identifiers; field names, table names, etc. (i.e. WHERE "last name" = 'Smith'). MySQL uses ` (accent mark or backtick) to quote system identifiers, which is decidedly non-standard. Note: you can make MySQL interpret quotes like PostgreSQL using SET sql_mode='ANSI_QUOTES'.
... WHERE lastname="smith"
... WHERE lower(lastname)='smith'
PostgreSQL is case-sensitive for string comparisons. The value 'Smith' is not the same as 'smith'. This is a big change for many users from MySQL (in MySQL, VARCHAR and TEXT columns are case-insensitive unless the "binary" flag is set) and other small database systems, like Microsoft Access. In PostgreSQL, you can either:
  • Use the correct case in your query. (i.e. WHERE lastname='Smith')
  • Use a conversion function, like lower() to search. (i.e. WHERE lower(lastname)='smith')
  • Use a case-insensitive operator, like ILIKE or *~
`LastName` = `lastname`

and maybe not?

"LastName" <> "lastname"
Database, table, field and columns names in PostgreSQL are case-independent, unless you created them with double-quotes around their name, in which case they are case-sensitive. In MySQL, table names can be case-sensitive or not, depending on which operating system you are using.
Note that PostgreSQL actively converts all non-quoted names to lower case and so returns lower case in query results!
'foo' || 'bar'
means OR
'foo' || 'bar'
means string concatenation (= 'foobar')
MySQL accepts C-language operators for logic, SQL requires AND, OR; use the SQL standard keywords for logic, both databases understand it.

Data Types[edit]

The ideas for this table were partially derived from automated dump converting script [1]. Official documentation:

List of available data types can be reached also by using psql's internal slash command \dT.

MySQL PostgreSQL ANSI Standard SQL comments
TINYINT
SMALLINT
MEDIUMINT
BIGINT
SMALLINT
SMALLINT
INTEGER
BIGINT
INTEGER
INTEGER
INTEGER
NUMERIC(20)
see [2]; integer size in PostgreSQL is 4 Bytes signed (-2147483648 – +2147483647)
TINYINT UNSIGNED
SMALLINT UNSIGNED
MEDIUMINT UNSIGNED
INT UNSIGNED
BIGINT UNSIGNED
SMALLINT
INTEGER
INTEGER
BIGINT
NUMERIC(20)
INTEGER
INTEGER
INTEGER
NUMERIC(10)
NUMERIC(20)
SQL doesn't know UNSIGNED, all numbers are signed.
FLOAT
FLOAT UNSIGNED
REAL
REAL
FLOAT4
FLOAT4
 
DOUBLE
DOUBLE PRECISION
FLOAT8
 
BOOLEAN
BOOLEAN
BOOLEAN
MySQL Booleans are an alias for TINYINT(1); PostgreSQL doesn't auto-convert numbers into booleans.
TINYTEXT
TEXT
MEDIUMTEXT
LONGTEXT
TEXT
TEXT
TEXT
TEXT
TEXT
TEXT
TEXT
TEXT
 
BINARY(n)
VARBINARY(n)
TINYBLOB
BLOB
MEDIUMBLOB
LONGBLOB
BYTEA
BYTEA
BYTEA
BYTEA
BYTEA
BYTEA
BIT(n)
BIT VARYING(n)
TEXT
TEXT
TEXT
TEXT
 
ZEROFILL
not available
not available
 
DATE
TIME
DATETIME
TIMESTAMP
DATE
TIME [WITHOUT TIME ZONE]
TIMESTAMP [WITHOUT TIME ZONE]
TIMESTAMP [WITHOUT TIME ZONE]
DATE
TIME
TIMESTAMP
TIMESTAMP
 
column SERIAL

equals to:

column BIGINT UNSIGNED NOT NULL AUTO_INCREMENT UNIQUE

or:

column INT DEFAULT SERIAL

equals to:

column INT NOT NULL AUTO_INCREMENT UNIQUE
column SERIAL

equals to:

CREATE SEQUENCE name;
CREATE TABLE table ( column INTEGER NOT NULL
        DEFAULT nextval(name)
);
column GENERATED BY DEFAULT
Note for PostgresSQL:

SERIAL = 1 – 2147483647
BIGSERIAL = 1 – 9223372036854775807

SERIAL is in fact an entity named SEQUENCE. It exists independently on the rest of your table. If you want to cleanup your system after dropping a table, you also have to DROP SEQUENCE nameMore on that topic...


Note for MySQL:

column SERIAL PRIMARY KEY

or

column SERIAL,
PRIMARY KEY(column)

Will result in having 2 indexes for column. One will be generated by thePRIMARY KEY constraint, and one by the implicit UNIQUE constraint present in the SERIAL alias. This has been reported as a bug and might be corrected.

column ENUM (value1, value2, [...])
column VARCHAR(255) NOT NULL,
CHECK (column IN (value1, value2, [...]))

or

CREATE TYPE mood AS ENUM ('sad','ok','happy');
CREATE TABLE person ( current_mood mood ... )
column VARCHAR(255) NOT NULL,
CHECK (column IN (value1, value2, [...]))
PostgreSQL doesn't have the ENUM types prior to 8.3, so you need to simulate it with contraints when using < 8.3.

Language Constructs[edit]


MySQL PostgreSQL comments
DESCRIBE table 
Using psql:
\d table 

or

SELECT
        a.attname AS Field,
        t.typname || '(' || a.atttypmod || ')' AS Type,
        CASE WHEN a.attnotnull = 't' THEN 'YES' ELSE 'NO' END AS Null,
        CASE WHEN r.contype = 'p' THEN 'PRI' ELSE '' END AS Key,
        (SELECT substring(pg_catalog.pg_get_expr(d.adbin, d.adrelid), '\'(.*)\'')
                FROM
                        pg_catalog.pg_attrdef d
                WHERE
                        d.adrelid = a.attrelid
                        AND d.adnum = a.attnum
                        AND a.atthasdef) AS Default,
        '' as Extras
FROM
        pg_class c 
        JOIN pg_attribute a ON a.attrelid = c.oid
        JOIN pg_type t ON a.atttypid = t.oid
        LEFT JOIN pg_catalog.pg_constraint r ON c.oid = r.conrelid 
                AND r.conname = a.attname
WHERE
        c.relname = 'tablename'
        AND a.attnum > 0
        
ORDER BY a.attnum
PostgreSQL doesn't implement an SQL extension; it uses psql's internal slash command instead. (Be careful: in the mysql client, \d is shorthand for DROP TABLE)
DROP TABLE IF EXISTS table 
DROP TABLE IF EXISTS table 
IF EXISTS in DROP TABLE clause only available since PostgreSQL 8.2.
REPLACE [INTO] table [(column, [...])] VALUES (value, [...])

or

INSERT INTO table (column1, column2, [...])
VALUES
        (value1, value2, [...])
ON DUPLICATE KEY UPDATE column1 = value1, column2 = value2 
CREATE FUNCTION someplpgsqlfunction() RETURNS void AS $$
BEGIN
        IF EXISTS( SELECT * FROM phonebook WHERE name = 'john doe' ) THEN
                UPDATE phonebook
                SET extension = '1234' WHERE name = 'john doe';
        ELSE
                INSERT INTO phonebook VALUES( 'john doe', '1234' );
        END IF;

        RETURN;
END;
$$ LANGUAGE plpgsql;
PostgreSQL doesn't implement REPLACE SQL extension. The presented solution uses PL/pgSQL.

(Note: MySQL REPLACE INTO deletes the old row and inserts the new, instead of updating in-place.)

SELECT ... INTO OUTFILE '/var/tmp/outfile'
COPY ( SELECT ... ) TO '/var/tmp/outfile'
 
SHOW DATABASES
Run psql with -l parameter

or using psql:

\l

or

SELECT datname AS Database FROM pg_database
        WHERE datistemplate = 'f'
PostgreSQL doesn't implement an SQL extension.
SHOW TABLES
Using psql:
\dt

or

SELECT c.relname AS Tables_in FROM pg_catalog.pg_class c
        LEFT JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace
WHERE pg_catalog.pg_table_is_visible(c.oid)
        AND c.relkind = 'r'
        AND relname NOT LIKE 'pg_%'
ORDER BY 1
PostgreSQL doesn't implement an SQL extension; it uses psql's internal slash command instead.
SELECT ... LIMIT offset, limit 

or

SELECT ... LIMIT limit OFFSET offset 
SELECT ... LIMIT limit OFFSET offset 
 
CREATE TABLE table ( column ... ,
        {INDEX|KEY} [name] (column, [...])
)

or

CREATE INDEX name ON table (column, [...])
CREATE INDEX name ON table (column, [...])
 
USE database ;
Using psql:
\c database 
 
UNLOCK TABLES;
-- nothing
"There is no UNLOCK TABLE command; locks are always released at transaction end." ( http://www.postgresql.org/docs/8.1/static/sql-lock.html )

Functions[edit]

MySQL PostgreSQL comments
LAST_INSERT_ID() CURRVAL('serial_variable') NOTE: it is not only "subsitute string" solution as you need to know the name of SERIAL variable (unlike AUTO_INCREMENT in MySQL). Also note that PostgreSQL can play with the OID of the last row inserted by the most recent SQL command.

NOTE2: Even better way to replace LAST_INSERT_ID() is creating a rule, because this cannot suffer from race-conditions:

CREATE RULE get_{table}_id_seq AS ON INSERT TO {table} DO SELECT currval('{table}_id_seq'::text) AS id;

(usage is somehow strange, you get a result from an INSERT-statement, but it works very well)

NOTE3: Another, more readable way:

INSERT INTO mytable VALUES (...) RETURNING my_serial_column_name;

Common Errors[edit]

  • ERROR: relation "something" does not exist - usually table doesn't exist as you probably didn't make it with the new datatypes or syntax. Also watch out for case folding issues; PostgreSQL = postgresql != "PostgreSQL".
  • prepared statement "dbdpg_X" does not exist -

PL/pgSQL[edit]

Install[edit]

In versions prior to 9.0, you have to make it available explicitly for every database:

your_unix$ su - postgres
your_unix$ .../pgsql/bin/createlang plpgsql -h localhost -d databasename 

Running A Function[edit]

SELECT definedfunction();

Administration[edit]

To use the same backup technique as used with MySQL, in /etc/logrotate.d/postgresql-dumps:

/dumps/postgresql/*/*.dump.gz {
        daily
        rotate 20
        dateext
        nocompress
        sharedscripts
        create
        postrotate
                for i in $(su - postgres -c "psql --list -t" | awk '{print $1}' | grep -vE '^$|^template[0-9]'); do
                        if [ ! -e /dumps/postgresql/$i ]; then mkdir -m 700 /dumps/postgresql/$i; fi
                        # compress even in custom format, because it can be compressed more
                        su - postgres -c "pg_dump --format=custom $i" | gzip > /dumps/postgresql/$i/$i.dump.gz
                done
        endscript
}
/dumps/postgresql/*/*.sql.gz {
        daily
        rotate 20
        dateext
        nocompress
        sharedscripts
        create
        postrotate
                for i in $(su - postgres -c "psql --list -t" | awk '{print $1}' | grep -vE '^$|^template[0-9]'); do
                        if [ ! -e /dumps/postgresql/$i ]; then mkdir -m 700 /dumps/postgresql/$i; fi
                        su - postgres -c "pg_dump --format=plain $i" | gzip > /dumps/postgresql/$i/$i.sql.gz
                done
        endscript
}
/dumps/postgresql/*/*.tar.gz {
        daily
        rotate 20
        dateext
        nocompress
        sharedscripts
        create
        postrotate
                for i in $(su - postgres -c "psql --list -t" | awk '{print $1}' | grep -vE '^$|^template[0-9]'); do
                        if [ ! -e /dumps/postgresql/$i ]; then mkdir -m 700 /dumps/postgresql/$i; fi
                        su - postgres -c "pg_dump --format=tar $i" | gzip > /dumps/postgresql/$i/$i.tar.gz
                done
        endscript
}


2,

Things to find out about when moving from MySQL to PostgreSQL

Last updated 8th April 2001

MySQL might be useful for the most complicated database project in the universe, and might be scalable to counting the grains of sand on every beach. However, its support for many 'high-end' database features is scarce (to its credit, in its place is an extremely easy-to-install, easy-to-administer, relatively speedy and painless database that works great on practically every OS you'd care to use.)

Some Differences Between PostgreSQL + MySQL

In general, PostgreSQL makes a strong effort to conform to existing database standards, where MySQL has a mixed background on this. If you're coming from a background using MySQL or Microsoft Access, some of the changes can seem strange (such as not using double quotes to quote string values).

  • MySQL uses nonstandard '#' to begin a comment line; PostgreSQL doesn't. Instead, use '--' (double dash), as this is the ANSI standard, and both databases understand it.
  • MySQL uses ' or " to quote values (i.e. WHERE name = "John"). This is not the ANSI standard for databases. PostgreSQL uses only single quotes for this (i.e. WHERE name = 'John'). Double quotes are used to quote system identifiers; field names, table names, etc. (i.e. WHERE "last name" = 'Smith').
  • MySQL uses ` (accent mark or backtick) to quote system identifiers, which is decidedly non-standard.
  • PostgreSQL is case-sensitive for string comparisons. The field "Smith" is not the same as the field "smith". This is a big change for many users from MySQL and other small database systems, like Microsoft Access. In PostgreSQL, you can either:
    • Use the correct case in your query. (i.e. WHERE lname='Smith')
    • Use a conversion function, like lower() to search. (i.e. WHERE lower(lname)='smith')
    • Use a case-insensitive operator, like ILIKE or ~*
  • Database, table, field and columns names in PostgreSQL are case-independent, unless you created them with double-quotes around their name, in which case they are case-sensitive. In MySQL, table names can be case-sensitive or not, depending on which operating system you are using.
  • PostgreSQL and MySQL seem to differ most in handling of dates, and the names of functions that handle dates.
  • MySQL uses C-language operators for logic (i.e. 'foo' || 'bar' means 'foo' OR 'bar', 'foo' && 'bar' means 'foo' and 'bar'). This might be marginally helpful for C programmers, but violates database standards and rules in a significant way. PostgreSQL, following the standard, uses || for string concatenation ('foo' || 'bar' = 'foobar').
  • There are other differences between the two, such as the names of functions for finding the current user. MySQL has a tool, Crash-Me, which can useful for digging this out. (Ostensibly, Crash-Me is a comparison tool for databases; however, it tends to seriously downplay MySQL's deficiencies, and isn't very objective in what it lists: the entire idea of having procedural languages (a very important feature for many users!) is relegated to a single line on the bottom fifth of the document, while the fact that MySQL allows you to use || for logical-or (definitely non-standard), is listed way before this, as a feature. Be careful about its interpretations.)

The Larger Picture

The important things (for me, at least) are more than just 'how do I do this MySQL thing in PostgreSQL?', but 'is there a much better way to think about this, which MySQL never even supported?'

For example:

Imagine you're building a dynamic website for human resources. You want to list every current senior staff member's name, some info about them, and a list of their goals.

With MySQL, you'd do something like this:

(this is generic pseudo-code, it would easily translate to PHP, Zope, EmbPerl, etc.)

 
     

$firstname $lastname

  • $goalinfo

That's great, and it works fine. You can easily translate this to PostgreSQL.

Would you want to, though? PostgreSQL has many features MySQL doesn't, like:

For instance, rather than coding in the web front end the logic of is-not-fired and is-senior-staff, in PostgreSQL, I'd make a VIEW of all staff for which we want to show goals:

CREATE VIEW staff_having_goals AS
SELECT staffid, firstname || lastname as fullname
FROM Staff
WHERE datefired ISNULL and seniorstaff = TRUE
ORDER BY lastname, firstname

Now, my web programming doesn't have to worry about the lower level concerns. Imagine if this same list of people and goals appeared dozens of times on your site -- I've moved from having this scattered in many places, to having it encapsulated in one place.

PostgreSQL also allows procedural languages (perl, tcl, python, and an Oracle-alike, PL/pgSQL). These allow you to create functions in your database (and even non-sysadmins can use them, as the functions fit in the PostgreSQL security model).

(Yes, MySQL has user functions, which last time I checked, had to be written in C, and linked into the database. A nice feature, to be sure, but VERY different from having high-level procedural languages usable w/o root privileges!)

We might use these procedural languages to create lists, handle database events (if a record is added here, automatically track this there, and so on. You might have a function to calculate a staff member's hourly compensation from their salary, which, IMHO, *should* be a database function, not a function coded for every different web project or front-end project you had.)

PostgreSQL also has transactions, which can remove some of the hairy if-an-error-happened-back-out-all-database-work code. (MySQL, to its credit, has transactions in their new MaxSQL thingie.)

In addition, PostgreSQL supports many standard parts of SQL that MySQL doesn't, such as subqueries, unions, intersections, etc. While you can often program around these, either with more SQL, or more logic in the front-end, the best (fastest, more portable, most abstracted) solution is to integrate this thinking into your query writing and database design.

So:

The things that are handled differently are fairly small, and can generally be handled without too much pain. Especially since you can easily create PostgreSQL user functions that mimic any from MySQL.

The real lesson is to learn about what features PostgreSQL has and figure out *why* to use them!

I'd start w/the five above (views, procedural languages, triggers, customizable aggregates, transactions) and make sure that you understand exactly what they are, how to use them, and how wonderful they are.

I hope this helps. I moved to PostgreSQL from using MySQL, and for several months after first playing with it, I just thought it was a bigger, more complicated database that did 'the same stuff' as MySQL. It took me a while to really realize how great the 'other' features are.

Good luck!

Joel Burton Director of Information Systems Support Center of Washington


3,

https://confluence.atlassian.com/display/JIRA/Switching+Databases#SwitchingDatabases-differenttype


Migrating JIRA's data to a different type of database server

Use this procedure to migrate JIRA's data to a different type of database server (e.g. from a MySQL server to a PostgreSQL server).

(tick) You can also use this procedure if your JIRA installation is currently using the internal HSQL database (which is only supported for evaluating JIRA) and you need to switch your JIRA installation across to using a  supported database (which are supported for JIRA installations used in a production environment).

  1. Create an export of your data as an XML backup. See Backing Up Data for details.
    (warning) Please note that JIRA's XML backup utility does not back up attachments (if you have attachments enabled).
  2. Create a new database on your new database server to house JIRA's data. See the appropriate database configuration guide in the Connecting JIRA to a Database section for the database creation instructions.
  3. Shut down your JIRA server.
  4. Make a backup of your JIRA Home Directory and JIRA Installation Directory.
  5. Delete the dbconfig.xml file in your JIRA Home Directory.
  6. Restart JIRA and you should see the first step of the JIRA Setup Wizard for configuring your database connection.
  7. Configure JIRA's connection to your new database (created in step 2 above) and click the 'Next' button.
  8. On the 'Application Properties' setup page, click the 'import your existing data' link and restore your data from the XML backup created in step 1 above.
请使用浏览器的分享功能分享到微信等