Posts tagged: database

PostgreSQL, how about you don’t make migrating from MySQL more complicated?

I am currently working on building a new site which is migrating from an existing one.  Front-end work mostly complete, now to creating the admin.

The Problem

Today, a really quite annoying “feature” of PostgreSQL raises its head and takes up far too much time to resolve.  For some background, the data is from an old site built in PHP & MySQL, being migrated to Django & PostgreSQL. This has been working fine, the frontend works and that’s great.

To maintain the old database id’s we inserted everything en-mass into the new database. This in itself was not the problem.  The problem arose when starting to adding new row and are expected auto-increment to the end.

Just a simple query, nothing overly complicated here.

INSERT INTO table (title) VALUES ('my_title');

However, instead of actually inserting into a new row at the end, it started working from id 1. This throws up errors whenever the id already existed (this differs from MySQL). Worse still, on some occasions started to delete items from the end of the database.

The Solution

So, what’s going on? I was initially expecting it to be a problem with my code although but, after plenty of testing, that wasn’t the case.

The primary key wasn’t updating the auto-increment value as expected. Here was the simple fix (change field names where necessary). This will reset the internal counter to the next available number.

SELECT setval('"table_id_seq"', coalesce(max("id"), 1), max("id") IS NOT null) FROM "table";

If you are using Django  you can generate the SQL by running, it does not run the automatically but you can pipe the response directly into the psql tool.

python manage.py sqlsequencereset appname

Very simple, but this does not follow the MySQL (less control over increments and automatically finding the next available number). Then again SQLite also does things differently and no doubt others as well.

It really pays to be aware of some of the fundamental, often unexpected, differences between databases and frameworks. Embrace change, don’t fear it. Do your research, is it better?, does it make for a logical change? But, don’t embrace unnecessarily.

Indexing your database fields – when and why.

This is fairly straightforward.  Indexing improves the performance of your database when doing certain things in your queries, including ORDER BY, WHERE and JOIN….among others I’m sure.

Indexing fields allows your database to reference a, well, index.  Think of it literally as the index at the back of a book.  It’s order alphabetically, and your given a page number.  The very same logic applies to the database.

Whenever you are going to ORDER by a field or any sort of lookup against a field, i.e WHERE my_field = 1; you want it indexed.  Your database will then refer to the indexes, find the rows it needs, in the correct order if necessary and in turn, returns the content for those rows.   Without indexing every row of data is checked every time, just like flicking through an entire book to find 1 paragraph.

There is caching available in some database languages, however you should not be relying on it, when a simple index is worth far more.

Different types of index

Primary, Unique, Index & Full-text.  For some databases, or even installations of MySQL, there are others available ie Spatial.

Each type has a different use.

  • Primary, most often this is an auto-incrementing id.
  • Unique, as it states data you require to be unique, username for example
  • Index, not required to be unique but you add for performance, something like the date a user signed up for displaying the latest user on your site.  A Django example is available on djangorocks.com
  • Full-text, allows full-text searching of text fields
  • Spatial, used for geometry based data to be stored and compared.  Depending on the database this is only 1 of many indexing types.

Naming your Database Tables & Fields


If you are working on your own projects, a lot of the following really does not necessarily apply…however when working with a team missing off some of these basics will not make you the most liked person. Please note, these are my own preferences, that I also utilise while working with a small team.

Reserved Words

Don’t use reserved words. Although you are able to use the back-tick to be able to query tables & fields using, using reserved words isn’t really smart. When looking over your own SQL code to debug or a problem or optimise the fields, it really doesn’t help when you see

SELECT `name`, `from` FROM `from` WHERE `from` = 'England';

A list of reserved words is available at http://dev.mysql.com/doc/mysqld-version-reference/en/mysqld-version-reference-reservedwords-5-1.html

Descriptive Names

Where possible, try to use descriptive words. When looking over the structure of your table in phpmyadmin (for example) you want to be able to know what the field or table is used for.

SELECT * FROM `category`; +----+---------+---------+-----------------------+-------------+ | id | title | slug | description | total_posts | +----+---------+---------+-----------------------+-------------+ | 1 | Example | example | A description ....... | 0 | +----+---------+---------+-----------------------+-------------+

The only real problem with this is when you come to using JOIN’s. When using PHP’s mysql_fetch_assoc() function, having identical field names across your 2 or more tables means you will only be getting 1 title. At this point you need to use AS to you can turn title into category_title and post_title. There is an example with AS in the Pluralisation section.

CamelCase, Under_scores, Hy-phens

Personally, I would use underscores. I’ll leave that one to you, however once you start, stick to it. It is rather annoying having inconsistency.

Pluralisation

This isn’t so much of a problem, mainly just a personal preference.

I name my database tables similar to as follows;

  • category
  • post
  • author

It makes more sense when doing a JOIN when reading, as follows

SELECT
    `post`.`title` AS `post_title`,
    `category`.`title` AS `category_title`
FROM `post`
LEFT JOIN `category` ON (`post`.`category_id` = `category`.`id`)
WHERE `post`.`id` = 1;

As opposed to

SELECT
    `posts`.`title` AS `post_title`,
    `categories`.`title` AS `category_title`
FROM `posts`
LEFT JOIN `categories` ON (`posts`.`category_id` = `categories`.`id`)
WHERE `posts`.`id` = 1;

In most cases you will be JOINing a single post to a single category. It also saves a few characters.

Foreign Keys

When defining foreign keys, as in the example above, when I reference the category in my post table, I use category_id. I use this both as a reference to the table I am running the JOIN statement with, as well as the field the JOIN takes place on.

Basics for designing your Database Structure


Over the next few days I will be publishing a few tutorials, covering the basics of designing your database.  I will cover both Django Model design and manually creating Database using phpmyadmin for MySQL and some of the differences between the MySQL storage engines – InnoDB & MyISAM.

Expected tutorials (I will link to these once they are running)

Dansette