Indexing your database fields – when and why.

This is fairly straightforward.  Indexing improves the performance of your database when doing certain things in your queries, including ORDER BY, WHERE and JOIN….among others I’m sure.

Indexing fields allows your database to reference a, well, index.  Think of it literally as the index at the back of a book.  It’s order alphabetically, and your given a page number.  The very same logic applies to the database.

Whenever you are going to ORDER by a field or any sort of lookup against a field, i.e WHERE my_field = 1; you want it indexed.  Your database will then refer to the indexes, find the rows it needs, in the correct order if necessary and in turn, returns the content for those rows.   Without indexing every row of data is checked every time, just like flicking through an entire book to find 1 paragraph.

There is caching available in some database languages, however you should not be relying on it, when a simple index is worth far more.

Different types of index

Primary, Unique, Index & Full-text.  For some databases, or even installations of MySQL, there are others available ie Spatial.

Each type has a different use.

  • Primary, most often this is an auto-incrementing id.
  • Unique, as it states data you require to be unique, username for example
  • Index, not required to be unique but you add for performance, something like the date a user signed up for displaying the latest user on your site.  A Django example is available on djangorocks.com
  • Full-text, allows full-text searching of text fields
  • Spatial, used for geometry based data to be stored and compared.  Depending on the database this is only 1 of many indexing types.

Naming your Database Tables & Fields

If you are working on your own projects, a lot of the following really does not necessarily apply…however when working with a team missing off some of these basics will not make you the most liked person. Please note, these are my own preferences, that I also utilise while working with a small team.

Reserved Words

Don’t use reserved words. Although you are able to use the back-tick to be able to query tables & fields using, using reserved words isn’t really smart. When looking over your own SQL code to debug or a problem or optimise the fields, it really doesn’t help when you see

SELECT `name`, `from` FROM `from` WHERE `from` = 'England';

A list of reserved words is available at http://dev.mysql.com/doc/mysqld-version-reference/en/mysqld-version-reference-reservedwords-5-1.html

Descriptive Names

Where possible, try to use descriptive words. When looking over the structure of your table in phpmyadmin (for example) you want to be able to know what the field or table is used for.

SELECT * FROM `category`; +----+---------+---------+-----------------------+-------------+ | id | title | slug | description | total_posts | +----+---------+---------+-----------------------+-------------+ | 1 | Example | example | A description ....... | 0 | +----+---------+---------+-----------------------+-------------+

The only real problem with this is when you come to using JOIN’s. When using PHP’s mysql_fetch_assoc() function, having identical field names across your 2 or more tables means you will only be getting 1 title. At this point you need to use AS to you can turn title into category_title and post_title. There is an example with AS in the Pluralisation section.

CamelCase, Under_scores, Hy-phens

Personally, I would use underscores. I’ll leave that one to you, however once you start, stick to it. It is rather annoying having inconsistency.

Pluralisation

This isn’t so much of a problem, mainly just a personal preference.

I name my database tables similar to as follows;

  • category
  • post
  • author

It makes more sense when doing a JOIN when reading, as follows

SELECT
    `post`.`title` AS `post_title`,
    `category`.`title` AS `category_title`
FROM `post`
LEFT JOIN `category` ON (`post`.`category_id` = `category`.`id`)
WHERE `post`.`id` = 1;

As opposed to
SELECT
    `posts`.`title` AS `post_title`,
    `categories`.`title` AS `category_title`
FROM `posts`
LEFT JOIN `categories` ON (`posts`.`category_id` = `categories`.`id`)
WHERE `posts`.`id` = 1;

In most cases you will be JOINing a single post to a single category. It also saves a few characters.

Foreign Keys

When defining foreign keys, as in the example above, when I reference the category in my post table, I use category_id. I use this both as a reference to the table I am running the JOIN statement with, as well as the field the JOIN takes place on.

Basics for designing your Database Structure

Over the next few days I will be publishing a few tutorials, covering the basics of designing your database.  I will cover both Django Model design and manually creating Database using phpmyadmin for MySQL and some of the differences between the MySQL storage engines – InnoDB & MyISAM.

Expected tutorials (I will link to these once they are running)

Deploying a standalone DNS Server using PowerDNS

In the next few weeks I will be deploying a new DNS server, standalone, with a custom control panel.

It is using PowerDNS.  The panel is written in PHP and located on the webserver, not the new DNS box.  The only non-DNS service I am running is MySQL, this is where PowerDNS gets its zones from.

I have have started work on coding the panel to also support slave DNS servers, however not using “notify” built into DNS servers, rather using the panel to create Master & Alias servers.  This just replicates the data across more than one server.  Much simpler to setup and all of the software exists in standard debian repositories (I think, if not I will also be adding details on adding new ones to the list)

Please do be aware though, I will not necessarily be releasing all of the source code for this, however there will be tutorials on writing it yourself, as well as a few scripts & commands to run to get your current BIND based zone files into the MySQL database of your new setup.

Facebook Connect

Recently I started writing facebook connect, from scratch, into one of the sites I’m working on.  Its actually a very easy process.  The (new?) javascript library is very good, setting cookies for authentication etc.

After actually having authenticated myself within my own site (facebook side) there is still quite a lot to do.

  • Creating / Updating accounts  – Uses the REST api.  I’ve got the account details, just need to use them
  • Authenticating the user automatically using the standard Django auth.
  • Removing accounts (or something if people remove my app from their list
  • A cron to update the accounts every 24 hours, facebook caching requirements

A few side things to still be done (maybe) wall posts, friend invites and I’m slightly tempted to use the facebook comments system, however…I am not entirely sure of the benefit of doing that rather than creating my own (which I have elsewhere) or using the standard Django comments framework.

This facebook connect stuff will also be part of the tutorials in future as well.

Dansette