Friday, May 31, 2013

How to back up a working Firebird database using a third-party backup tool.


One of the basic problems with using a third-party backup tool, or a simple file copy, on a Firebird database while it is active and online is that the tool has no concept of transactions. All you get is a copy of whatever is in the O/S buffers or on disk at the time the copy is made. However, the database may be changing underneath the copy as new or active transactions are committed while the copy is taking place. At best this produces an inconsistent database; at worst, something that is corrupt and can't be used. Prior to Firebird 2.0, the only way to take a backup or copy like this (other than using gbak) was to shut the database down and make sure that no users were accessing it before you invoked the third-party backup tool or copy.

However, it is possible to use nbackup together with a third-party backup tool to achieve the functional equivalent of a gbak backup.

The first thing you need to do is start a "freeze" on your database using the following syntax.

nbackup -U username -P password -L database.fdb

This effectively locks the database: a flag on the database header page is set to "Locked" to let the engine know that all amended database pages that would be written to the database are now redirected to a delta file.

Changes are flushed from the internal (Firebird) database cache to the O/S cache when a transaction is committed; if forced writes are on, these changes are flushed directly to disk. The final task on commit is to mark the transaction as committed in the Transaction Inventory Page. Once the database is locked, all commits are written to the delta file rather than to the database, which keeps the main database file in a consistent state.

Once the lock is applied, a simple gstat -h on the database will show the "LOCKED" status as an optional database attribute. Once the -L command has done its work, a delta file is in place, ready to receive any committed changes to database pages.

You can now use your alternative backup tool while database users continue to work. When your backup tool has finished taking its copy of the database, you can "unfreeze" the database using the nbackup -N (unlock) command.

nbackup -U username -P password -N database.fdb

The unfreeze causes nbackup to merge the changed pages from the delta file back into the main database. When the merge has completed, the delta file is removed and the database header is changed back to its normal state.
The backup you made of the database in its frozen state will still be in a "LOCKED" state, so if you need to restore it, users will be unable to attach to it until you perform a "fixup" (nbackup -F). The fixup resets the locked flag on the database header page back to normal, even though there isn't a delta file associated with the database.
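
The freeze, copy, unfreeze sequence described above could be scripted roughly as follows. This is only a sketch under some assumptions: the function names (nbackup_cmd, fixup_cmd, frozen_copy) are made up for illustration, shutil.copyfile stands in for whatever third-party backup tool you actually use, and the fixup is issued without credentials. The nbackup switches themselves (-U, -P, -L, -N, -F) are the ones discussed in this post.

```python
import shutil
import subprocess


def nbackup_cmd(flag, database, user, password):
    """Build the nbackup argument list for -L (lock) or -N (unlock)."""
    if flag not in ("-L", "-N"):
        raise ValueError(f"unsupported nbackup flag: {flag}")
    return ["nbackup", "-U", user, "-P", password, flag, database]


def fixup_cmd(database_copy):
    """Build the fixup command for a copy taken while the database was locked.

    Assumption: the fixup works directly on the file, so no credentials
    are passed here.
    """
    return ["nbackup", "-F", database_copy]


def frozen_copy(database, copy_path, user, password,
                run=subprocess.check_call, copy=shutil.copyfile):
    """Lock the database, copy the file, then unlock it again.

    `run` and `copy` are injectable so the sequence can be dry-run;
    `copy` is a stand-in for the third-party backup tool.
    """
    run(nbackup_cmd("-L", database, user, password))
    try:
        copy(database, copy_path)
    finally:
        # Always unfreeze, even if the copy failed, so the delta file
        # is merged back and the database returns to its normal state.
        run(nbackup_cmd("-N", database, user, password))
```

The unlock sits in a finally block so that a failed copy does not leave the live database frozen. If the copy ever has to be restored, the command built by fixup_cmd would be run against the restored file before users attach to it.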

Note: If you are going to use this capability, please make sure that you are using the latest version of Firebird, as a number of bugs in nbackup have been fixed since its original release.

Thursday, May 23, 2013

CORE-4100


An attempt to explain the rationale behind Vlad's fix for CORE-4100.

When a sweep has successfully finished its work, it advances the Oldest Interesting Transaction (OIT) up to the value of the Oldest Snapshot Transaction (OST) that was recorded when the sweep started. The OIT is the first transaction in a state other than committed in the database’s Transaction Inventory Pages (TIP), while the OST is the oldest transaction that was started in Snapshot mode.

If, while the sweep is running, more new transactions are started than the sweep interval (by default, a sweep is triggered when the OIT is 20,001 transactions less than the Oldest Snapshot Transaction), it is possible that the new OIT value is again more than the sweep interval behind the OST, so a new sweep could start immediately.

After a sweep has completed, the first new transaction picks up the updated OIT from the database header page, that is, the OST value that was saved when the sweep began. It also reads the actual OST from the header page, as well as the Oldest Active Transaction (OAT), the first transaction marked as active in the TIP. If the sweep condition is then met, a sweep begins.

Ideally, the transaction should use the OIT that it has recalculated itself (held in the transaction), rather than the OIT from the header page, to determine whether a sweep should start or not.

An example:

1. Transaction 1000 was rolled back, so the next transaction, when it calculates the OIT, will use 1000, which is now considered a stuck or “interesting” value.

2. By the time transaction 21001 occurs we have the following numbers:
OIT     1000
OST     21000
Next    21001

3. An automatic sweep is started, and it makes a note that the OST is 21000.

4. While the sweep is running, 30000 new transactions get started and committed.

5. When the sweep has finished doing its garbage collection and is about to advance the OIT, the numbers on the database header page are in effect:
OIT     1000
OST     51000
Next    51001

6. The sweep then advances the OIT up to the previously noted OST (21000).

7. A new transaction is started and it then obtains the following numbers from the database header page:
tra_oldest (OIT)           21000
tra_oldest_active (OAT)    51002
tra_number (OST)           51002
 
However, within the transaction it has also recalculated the new oldest interesting transaction number as 51001, which will be written to the database header page at the end of the transaction.

8. However, based on the OIT read from the database header page, the following condition is true:

tra_oldest_active (OAT) - tra_oldest (OIT) > sweep_interval
51002 - 21000 = 30002 > 20000, therefore a sweep will be started again.

9. However, by the time the sweep starts, the database header page will have been updated to contain an OIT of 51001. So instead of doing the above, we really should check the local OIT that is about to be written out to the header page, rather than the header page itself, before deciding whether to do a sweep or not.
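
To make the arithmetic in the example concrete, here is a small sketch that applies the sweep condition from step 8 to both candidate OIT values. It is not Firebird's source code; the function and variable names are invented for illustration, and the numbers are simply the ones used in the steps above.

```python
SWEEP_INTERVAL = 20000  # sweep interval used in the example


def sweep_needed(oldest_active, oldest_interesting,
                 sweep_interval=SWEEP_INTERVAL):
    """Condition from step 8: OAT - OIT > sweep_interval."""
    return oldest_active - oldest_interesting > sweep_interval


header_oit = 21000  # OIT on the header page after the sweep (step 6)
local_oit = 51001   # OIT recalculated inside the new transaction (step 7)
oat = 51002         # tra_oldest_active seen by the new transaction (step 7)

# Checking against the header page value: 51002 - 21000 = 30002,
# which exceeds the interval, so a second sweep is started.
check_with_header_oit = sweep_needed(oat, header_oit)

# Checking against the locally recalculated value: 51002 - 51001 = 1,
# which does not exceed the interval, so no sweep is started.
check_with_local_oit = sweep_needed(oat, local_oit)

print(check_with_header_oit, check_with_local_oit)  # prints: True False
```

With the header page value the condition holds and an unnecessary sweep is triggered; with the locally recalculated value it does not, which is the behaviour argued for in step 9.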