Posts tagged: database

PostgreSQL, how about you don’t make migration more complex

I am currently work on building a new site, from an existing one.  A lot of the front-end work is done, now to a lot of the admin.

Today a really quite annoying “feature” of PostgreSQL raises its head and takes up 3 hours of my time.  A lot of the data from the old site PHP/MySQL has been migrated to the new one Django/PostgreSQL which all worked fine, the front end works and its all lovely.  To maintain the old database id’s we inserted them all into the new database, this in itself is not the problem.  The problem arose what starting to adding new rows, which are expected to go to the end.

INSERT INTO table (title) VALUES ('my_title');

Instead of actually insert into a new row, this started working from id 1, throwing up errors whenever the id already existed and sometimes started to delete items from the end of the database.

Just what the hell.  After plenty of research and testing, initially expecting it to be a problem with my code and the problem is found.

Adding the Primary Key manually DOES NOT update the auto-increment value, and here is the fix (change field names where necessary)

SELECT SETVAL('sequence_name', (SELECT MAX(id) FROM table_name) + 1);

If you are using Django  you can generate the SQL by running, it does not run the automatically but you can pipe the response directly into the psql tool.

python manage.py sqlsequencereset appname

Very simple but this does not follow the MySQL style I am used to, then again SQLite also does things differently which had me stumped at first.

Indexing your database fields – when and why.

This is fairly straightforward.  Indexing improves the performance of your database when doing certain things in your queries, including ORDER BY, WHERE and JOIN….among others I’m sure.

Indexing fields allows your database to reference a, well, index.  Think of it literally as the index at the back of a book.  It’s order alphabetically, and your given a page number.  The very same logic applies to the database.

Whenever you are going to ORDER by a field or any sort of lookup against a field, i.e WHERE my_field = 1; you want it indexed.  Your database will then refer to the indexes, find the rows it needs, in the correct order if necessary and in turn, returns the content for those rows.   Without indexing every row of data is checked every time, just like flicking through an entire book to find 1 paragraph.

There is caching available in some database languages, however you should not be relying on it, when a simple index is worth far more.

Different types of index

Primary, Unique, Index & Full-text.  For some databases, or even installations of MySQL, there are others available ie Spatial.

Each type has a different use.

  • Primary, most often this is an auto-incrementing id.
  • Unique, as it states data you require to be unique, username for example
  • Index, not required to be unique but you add for performance, something like the date a user signed up for displaying the latest user on your site.  A Django example is available on djangorocks.com
  • Full-text, allows full-text searching of text fields
  • Spatial, used for geometry based data to be stored and compared.  Depending on the database this is only 1 of many indexing types.

Naming your Database Tables & Fields

If you are working on your own projects, a lot of the following really does not necessarily apply…however when working with a team missing off some of these basics will not make you the most liked person. Please note, these are my own preferences, that I also utilise while working with a small team.

Reserved Words

Don’t use reserved words. Although you are able to use the back-tick to be able to query tables & fields using, using reserved words isn’t really smart. When looking over your own SQL code to debug or a problem or optimise the fields, it really doesn’t help when you see

SELECT `name`, `from` FROM `from` WHERE `from` = 'England';

A list of reserved words is available at http://dev.mysql.com/doc/mysqld-version-reference/en/mysqld-version-reference-reservedwords-5-1.html

Descriptive Names

Where possible, try to use descriptive words. When looking over the structure of your table in phpmyadmin (for example) you want to be able to know what the field or table is used for.

SELECT * FROM `category`; +----+---------+---------+-----------------------+-------------+ | id | title | slug | description | total_posts | +----+---------+---------+-----------------------+-------------+ | 1 | Example | example | A description ....... | 0 | +----+---------+---------+-----------------------+-------------+

The only real problem with this is when you come to using JOIN’s. When using PHP’s mysql_fetch_assoc() function, having identical field names across your 2 or more tables means you will only be getting 1 title. At this point you need to use AS to you can turn title into category_title and post_title. There is an example with AS in the Pluralisation section.

CamelCase, Under_scores, Hy-phens

Personally, I would use underscores. I’ll leave that one to you, however once you start, stick to it. It is rather annoying having inconsistency.

Pluralisation

This isn’t so much of a problem, mainly just a personal preference.

I name my database tables similar to as follows;

  • category
  • post
  • author

It makes more sense when doing a JOIN when reading, as follows

SELECT
    `post`.`title` AS `post_title`,
    `category`.`title` AS `category_title`
FROM `post`
LEFT JOIN `category` ON (`post`.`category_id` = `category`.`id`)
WHERE `post`.`id` = 1;

As opposed to
SELECT
    `posts`.`title` AS `post_title`,
    `categories`.`title` AS `category_title`
FROM `posts`
LEFT JOIN `categories` ON (`posts`.`category_id` = `categories`.`id`)
WHERE `posts`.`id` = 1;

In most cases you will be JOINing a single post to a single category. It also saves a few characters.

Foreign Keys

When defining foreign keys, as in the example above, when I reference the category in my post table, I use category_id. I use this both as a reference to the table I am running the JOIN statement with, as well as the field the JOIN takes place on.

Basics for designing your Database Structure

Over the next few days I will be publishing a few tutorials, covering the basics of designing your database.  I will cover both Django Model design and manually creating Database using phpmyadmin for MySQL and some of the differences between the MySQL storage engines – InnoDB & MyISAM.

Expected tutorials (I will link to these once they are running)

Dansette