Paul Beach's Blog: 2010

Monday, September 6, 2010

Promenade - Chezelles - 2010

On the first Sunday afternoon of every September at 2.00pm (except for the year it was cancelled due to a clash with an agricultural show), there is a "promenade gastronomique" (a walk where you taste local produce) at Chezelles. The walk this Sunday was 9km (thankfully no longer, because the thermometer was showing 29 degrees centigrade in the shade).

On the walk, we tasted local wines, goat's cheese, jams, honey products and beer, and had a great time.

I have never seen this "walk" advertised, but it's very popular. If you are in the area at the same time next year, it might be worth having a go.

Cost this year was 4 Euros per person.

Thursday, September 2, 2010

Stirling EH924 25th/26th July 1943 Essen

It's been a while since I posted anything to the blog. There are no major reasons why I haven't posted; I just haven't had anything major or interesting to say.

However, while doing research about my Grandfather's RAF career, I came across this comment in the 51 Squadron Operations Record Book for the night of the 25th/26th July 1943.

Details of Sortie on Flight
.... "An aircraft flying immediately behind this Halifax was shot down in the Goch area. Halifax itself however returned to base without incident."

I was curious as to which aircraft was shot down, and a web search revealed the following on lostbombers.co.uk. The data on that site is taken from William Chorley's excellent books about RAF Bomber Command's losses during the Second World War. Note that the information on lostbombers.co.uk is out of date and was compiled from William Chorley's books without permission from William or his publishers.

Stirling EH924 Information
Type Stirling
Serial Number EH924
Squadron 620
X1D QS-B
Operation Essen
Date 1 25th July 1943
Date 2 26th July 1943

Further Information

"Serial Range EH921 - EH961. 41 Stirling Mk.111. Part of a batch of 120 Short S.29 Stirling Mk.111. EH875-EH909; EH921-EH961;
EH977-EH996; EJ104-EJ127 Mk.111, of which EH897, EH950 and EJ106 were converted to Mk.1V. Delivered by Austin Motors Ltd between Jun43 and Sep43.
Contract No.B982939/39. Delivered to No.32MU 30Jun43, to No.620 Sqdn 11Jul43.
EH924 was one of three No.620 Sqdn Stirlings lost on this operation. See: BF511; EE906.
Airborne 2226 25Jul43 from Chedburgh. Cause of loss not established. Crashed at Asperden, 3 km from Goch. Those killed were taken for burial in the Stadtfriedhof at Monchengladbach. They have been subsequently re-interred in the Rheinberg War Cemetery.
Sgt J.D.Rathbone KIA
Sgt J.H.Wallace KIA
Sgt R.Wild KIA
F/O J.F.Shepherd KIA
Sgt J.F.Wells KIA
Sgt A.Simons KIA
Sgt D.H.Castling RCAF Inj
Sgt D.H.Castling was confined to Hospital due injuries. No PoW No."

No other bomber lost that night crashed anywhere near Goch, so it looks as if EH924 was the aircraft in question, shot down by a German night-fighter. I hope somebody out there finds this information useful.

Thursday, April 29, 2010

Dell Latitude D800

I have an "oldish" laptop, a Dell Latitude D800 to be precise, which I have had for a few years. I use it when on the road, and my wife uses it at home. Recently I have had problems turning it on. This morning it completely refused to start. Some extensive googling for a possible solution turned up this nugget of information, posted by PaxRiverWino on TechRepublic.

I have copied the information below - just in case it ever goes missing.

THIS FIXES IT. Dell Latitude 800 no power, no light, does not boot

Take out the battery and unplug from the AC adapter. Open your computer so you are facing the screen. There is a grey strip behind the keyboard that is about 1 inch on the left and a half inch on the right. It will say LATITUDE|D800 on it. Pry it open on both ends. On the right you will see a plug in device. Remove it from being seated and then reseat it. Replace the strip. I'll bet you that is all it takes.

PaxRiverWino - too right - This did indeed fix it!

Friday, February 12, 2010

More detailed query plans, part 2 (textual output)

Posted by Dimitry to the Firebird Developer List

Getting back to this subject, I'd like to discuss possible textual formats for the structured query plans.

Below is a sample query used to demonstrate the idea (taken from the TPC-R suite):

select first 10
    l_orderkey,
    sum(l_extendedprice * (1 - l_discount)) as revenue,
    o_orderdate,
    o_shippriority
from
    customer,
    orders,
    lineitem
where
    c_mktsegment = 'BUILDING'
    and c_custkey = o_custkey
    and l_orderkey = o_orderkey
    and o_orderdate < date '1995-03-15'
    and l_shipdate > date '1995-03-15'
group by
    l_orderkey,
    o_orderdate,
    o_shippriority
order by
    2 desc,
    o_orderdate;

This is a detailed plan output that could be shown for this query. It's based on the current binary access path format:

SELECT
    -> FIRST N
        -> SORT
            -> AGGREGATE
                -> SORT
                    -> LOOP JOIN (INNER)
                        -> FILTER
                            -> TABLE [ORDERS] ACCESS BY DBKEY
                                -> BITMAP
                                    -> INDEX [ORDERS_ORDERDATE] SCAN
                        -> FILTER
                            -> TABLE [CUSTOMER] ACCESS BY DBKEY
                                -> BITMAP
                                    -> INDEX [CUSTOMER_PK] SCAN
                        -> FILTER
                            -> TABLE [LINEITEM] ACCESS BY DBKEY
                                -> BITMAP
                                    -> INDEX [LINEITEM_PK] SCAN

In fact, it's the real output, which has been working in my private tree for a couple of months already, but I don't insist on the representation (inspired by Oracle and PGSQL), so feel free to criticize.
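To make the nesting explicit, here is a minimal C++ sketch of how an indented listing like the one above could be produced from a tree of plan nodes. It assumes each `->` line is a child of an operator above it. `PlanNode` and `renderPlan` are hypothetical names made up for illustration, not anything from the Firebird sources.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical plan node: an operator label plus its child operators.
// This is an illustration only, not a Firebird type.
struct PlanNode
{
    std::string label;
    std::vector<PlanNode> children;
};

// Render the tree one node per line. The root is printed as-is; every
// other node is indented four spaces per level and prefixed with "-> ".
std::string renderPlan(const PlanNode& node, int depth = 0)
{
    std::string out;
    if (depth > 0)
        out += std::string(static_cast<std::size_t>(depth) * 4, ' ') + "-> ";
    out += node.label + "\n";
    for (const PlanNode& child : node.children)
        out += renderPlan(child, depth + 1);
    return out;
}
```

A join simply has several children, so the three FILTER branches under LOOP JOIN (INNER) would all be rendered at the same depth.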

As soon as we have the new binary access path format (discussed in part 1) implemented, the output could look like this (just an example):

SELECT
[cost: 360000, rows: 10]
    -> FIRST N (100)
       [cost: 360000, rows: 10]
        -> SORT ( DESC, o_orderdate ASC)
           [cost: 360000, rows: 100]
            -> AGGREGATE (SUM)
               [cost: 350000, rows: 100]
                -> SORT (l_orderkey ASC, o_orderdate ASC, o_shippriority ASC)
                   [cost: 300000, rows: 75000]
                    -> LOOP JOIN (INNER)
                       [cost: 150100, rows: 75000]
                        -> FILTER (o_orderdate < date '1995-03-15')
                           [cost: 75050, rows: 75000]
                            -> TABLE [ORDERS] ACCESS BY DBKEY
                               [cost: 75050, rows: 75000]
                                -> BITMAP
                                   [cost: 50]
                                    -> INDEX [ORDERS_ORDERDATE] RANGE SCAN
                                       [cost: 50, used segments: 1]
etc

Another question I'd like to raise is the API to get the textual plan representation. fb_info_sql_access_path is expected to return the binary access path. We could have another tag, e.g. fb_info_sql_access_path_as_text, which works similarly to the current isc_info_sql_plan, i.e. performs the transformation on the server.

Another option could be to follow the fb_interpret() way and offer a client-side (actually, Y-valve) API call which would perform the binary-to-text conversion. The latter approach may look unreliable in the case of client/server version mismatch, but the worst possible thing for the client would be to get a reduced plan with unknown items printed as some placeholder.

Comments?

RFC: Concept of the stored table/index statistics

Dimitry posted the following for discussion to the Firebird developers list.

Currently, the only stored statistical information is the index selectivity (per segment since ODS11). The number of records (aka cardinality) in tables is estimated using the number of data pages and table format size (aka table width). This information is far from enough to allow the optimizer to make good decisions.
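As a rough illustration of why a page-count estimate is a weak input, here is a minimal C++ sketch of that kind of calculation. The formula, the function name `estimateCardinality` and the `pageOverhead` parameter are assumptions made for this example; the exact calculation the engine performs is not given in the post.

```cpp
#include <cstdint>

// Rough cardinality guess: how many fixed-width records would fit on the
// table's data pages. pageOverhead is an assumed per-page header cost.
// This is a hypothetical formula, not the engine's actual one.
std::uint64_t estimateCardinality(std::uint64_t dataPages,
                                  std::uint32_t pageSize,
                                  std::uint32_t recordWidth,
                                  std::uint32_t pageOverhead = 0)
{
    if (recordWidth == 0 || pageOverhead >= pageSize)
        return 0;
    const std::uint64_t usable = pageSize - pageOverhead;
    return dataPages * (usable / recordWidth);
}
```

For example, `estimateCardinality(1000, 8192, 100)` gives 81000. An estimate of this kind knows nothing about record compression, partly filled pages or old record versions, so it can be a long way from the real count.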

On the other hand, we have GSTAT, which returns much more information that is very useful by itself (to a DBA or developer) but could also be used by the optimizer. And v3.0 is already offering even more details in the GSTAT output.

I was thinking about combining these two approaches.

In the proposed new world, statistics would be stored inside the database in a binary form (read: as a blob) along with its header which includes: format version number, timestamp of its collection and probably some other fields. We could offer a built-in blob filter which translates the binary data into the textual form (e.g. looking like the GSTAT output).

It would contain all the data that GSTAT is currently able to report and even more, surely extensible in the future. It would consist of two parts: table statistics (complete -- including fields statistics, or reduced -- without fields) and index statistics (perhaps also in complete and reduced form, where complete one would contain e.g. value distribution histograms). I'd store them in RDB$RELATIONS.RDB$STATISTICS and RDB$INDICES.RDB$STATISTICS but the latter name is already in use. RDB$STAT_INFO? A separate table?

The optimizer would use the stored statistics to find better execution plans. If the statistics are considered invalid or outdated, it could default to some simpler calculations, like the ones used currently, or it could still use the outdated statistics. There may be different rules for such a consideration, e.g. number of records in the stats vs the quick estimation based on data pages, or too old a timestamp, or too big a mismatch between the estimated cost and the real one calculated at runtime, etc. Threshold values could be configurable per database. Invalid or outdated statistics would also trigger their re-calculation in the background.
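Two of the rules mentioned above (age of the statistics, and stored record count vs the quick page-based estimate) could be sketched like this in C++. Everything here is hypothetical: the `StatsHeader` fields, the `StaleRules` thresholds and the `isOutdated` function are invented for the example and are not part of the proposal itself.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical header stored in front of the statistics blob.
struct StatsHeader
{
    std::uint16_t formatVersion;   // layout version of the blob
    std::int64_t  collectedAt;     // collection time, seconds since epoch
    std::uint64_t recordCount;     // record count at collection time
};

// Hypothetical per-database thresholds.
struct StaleRules
{
    std::int64_t maxAgeSeconds;    // older statistics count as outdated
    double       maxRelativeDrift; // allowed |estimate - stored| / stored
};

// True if the stored statistics look too old, or if the stored record
// count disagrees too much with the quick page-based estimate.
bool isOutdated(const StatsHeader& stats, std::uint64_t quickEstimate,
                std::int64_t now, const StaleRules& rules)
{
    if (now - stats.collectedAt > rules.maxAgeSeconds)
        return true;
    if (stats.recordCount == 0)
        return quickEstimate != 0;
    const double stored = static_cast<double>(stats.recordCount);
    const double drift =
        std::fabs(static_cast<double>(quickEstimate) - stored) / stored;
    return drift > rules.maxRelativeDrift;
}
```

The third rule from the text (estimated cost vs the real cost measured at runtime) would need feedback from query execution and is left out of this sketch.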

GSTAT gets new switches that would be used to:

(a) re-collect the statistics from disk and show it (as it works now)
(b) re-collect the statistics from disk and store it in the database
(c) show the statistics stored currently
(d) invalidate the stored statistics thus forcing a delayed re-scan

I'm not sure whether the default behaviour should be legacy (a) or (c).

It also gets sub-options that could control the level of details we need: only table level, including columns, including histograms, etc.

We could also add appropriate SQL commands to the engine, e.g.:

SET STATISTICS [FOR] {TABLE | INDEX} [options]
DROP STATISTICS [FOR] {TABLE | INDEX} [options]

or:

ALTER TABLE {SET | DROP} STATISTICS [options]
ALTER INDEX {SET | DROP} STATISTICS [options]
ALTER DATABASE {SET | DROP} STATISTICS [options]

or whatever. The current SET STATISTICS INDEX could be kept intact for backward compatibility or adapted to the new semantics.

Only table owners and DBA would be allowed to update/reset the stored statistics.

As you can see, there are many details that deserve discussion. I've intentionally omitted the kinds of statistical values that might be stored and how they could be used.

But before going into the details, I'd like some basic feedback on whether it's considered a good concept or not.

I don't pretend to have the entire work completed any time soon, but I'd do my best in setting up the core infrastructure (which could later evolve into something wider) in v3.0.

Comments please.

Monday, February 8, 2010

Type 7 database pages and Firebird 2.x

Norman Dunbar posted the following on the Firebird Development list:

"I'm documenting the internal page formats of a database for the Doc project. I've checked in (to cvs) the document so far, but obviously a half finished document is no good in real life. To this end, I have a couple of questions on the internal formats of the type 7 database pages for Firebird 2.x, if you don't mind:

As before I have read Ann's (1.5?) documentation over at IBPhoenix/R&D, however, Firebird 2 seems to have changed things (slightly).

What compression is used in the page/btree_nod entries? I've got a database with a couple of known entries and I cannot decode the btree_nods sensibly. Happy to be pointed at the appropriate code file in the source.

Is there a difference in the page layout at all if the page I'm examining is not a leaf page? (btr_level != 0)"

Dimitry Yemanov replied.

"This isn't going to be easy to answer, as the page layout is much different between ODS10 and ODS11.

In all ODS incarnations, prefix compression is used for index keys. It means that the first key is stored "as is" and subsequent keys are stored as "deltas" represented by three values:
(1) length of the data that should be taken from the prior key (aka prefix),
(2) length of the data that is stored in our key (aka suffix), and
(3) the suffix itself, whose length is given by (2).

In ODS11, key internals (prefix length, suffix length, page number, record number) are also kinda compressed to store only the significant part of an appropriate integer value.

In ODS10, you'll see them at a fixed size:

struct btree_nod
{
    UCHAR btn_prefix;     // prefix length
    UCHAR btn_length;     // suffix length
    UCHAR btn_number[4];  // page or record number
    UCHAR btn_data[1];    // suffix data of btn_length bytes
};

Also, ODS11 has some special flags in the first byte of the index entry which make it possible to avoid storing prefix/suffix values at all in some cases.

Relevant source code: jrd/btr.cpp and jrd/btn.cpp.

> Is there a difference in the page layout at all if the page I'm examining is not a
> leaf page? (btr_level != 0)

They're nearly identical. IIRC keys on non-leaf pages contain both page numbers and record numbers.

Also beware of the jump nodes introduced in ODS11. It's a sparse lookup table (key -> offset) which is stored on the page along with the keys themselves."
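The prefix compression Dimitry describes (first key stored as is, later keys stored as prefix length plus suffix) can be sketched in a few lines of C++. This is only an illustration of the idea on plain strings: `CompressedKey`, `compressKeys` and `expandKeys` are made-up names, and the real on-page encoding lives in the source files he mentions.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// One compressed entry: take `prefix` bytes from the previous key,
// then append `suffix` (whose length is the stored suffix length).
struct CompressedKey
{
    std::size_t prefix;
    std::string suffix;
};

// Compress a list of keys that is already in index (sorted) order.
std::vector<CompressedKey> compressKeys(const std::vector<std::string>& sortedKeys)
{
    std::vector<CompressedKey> out;
    std::string prev;  // empty before the first key, so it is stored "as is"
    for (const std::string& key : sortedKeys)
    {
        const std::size_t limit = std::min(prev.size(), key.size());
        std::size_t shared = 0;
        while (shared < limit && prev[shared] == key[shared])
            ++shared;
        out.push_back({shared, key.substr(shared)});
        prev = key;
    }
    return out;
}

// Rebuild the full keys. Each key needs the one before it, so the
// entries have to be walked in order from the start.
std::vector<std::string> expandKeys(const std::vector<CompressedKey>& nodes)
{
    std::vector<std::string> out;
    std::string prev;
    for (const CompressedKey& node : nodes)
    {
        prev = prev.substr(0, node.prefix) + node.suffix;
        out.push_back(prev);
    }
    return out;
}
```

With the keys FIREBIRD, FIREFLY, FIREFOX and FISH, the stored entries come out as (0, "FIREBIRD"), (4, "FLY"), (5, "OX") and (2, "SH"). Note that a key in the middle cannot be decoded without walking the keys before it.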

Ann Harrison also replied.

> They're nearly identical. IIRC keys on non-leaf pages contain both page numbers
> and record numbers.

That's right. Leaf pages contain nodes that have a header, data (possibly compressed), and the record number of the record that corresponds to the data. Upper level pages in ODS11 contain nodes that have a header, data, the record number of the record that contains that data, and the page number of the lower level page that starts with that data value. The reason for "promoting" the record number to the upper level is not to give faster access to the record during a normal search; it's to avoid a garbage collection problem with indexes with lots of duplicates.

In ODS-10 and earlier, the algorithm for storing duplicates put the newest record first. So if you stored 100,000 records with the same key value and a generated primary key, the records would be stored in primary key order (more or less) and the index entries would be stored in inverse primary key order. When you delete those records, you delete the lowest primary key value first. Unfortunately, the index entry for that record will be at the end of the chain of duplicates, so you have to read 99,999 entries before you find the one you want.

Then you delete the second record and read 99,998 entries, etc. It's called thrashing the cache.

In ODS-11 and higher, the record number becomes part of the key. Duplicates are stored in record number order. In effect, every key is unique.

> Also beware about the jump nodes introduced in ODS11. It's a sparse lookup table
> (key -> offset) which is stored on the page along with the keys themselves.

Jump nodes are like an index into an index page. Doesn't seem to make sense, but with pages larger than about 4K, Firebird was spending an inordinate amount of time reading across index pages - reading on average 1/2 the page size at each level. Jump nodes reduce the average read to about 500 bytes. As it turns out, the optimal size for an index page is different from the optimal size for data and jump nodes are a way of getting the best performance for both.

Note that systems that don't compress keys can use a binary search on an index page, so the size of an index page doesn't matter as much. On the other hand, they pay for that binary search in I/O.
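The jump-node idea can be illustrated with a small C++ sketch: a sparse table of full keys and positions lets a lookup skip most of the page before it starts the sequential walk that compressed keys require. The names `JumpNode`, `buildJumpTable` and `findKey`, the fixed `stride`, and the use of plain uncompressed strings are all assumptions for the example, not the actual ODS11 layout.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical jump entry: a full key and the position of its node.
struct JumpNode
{
    std::string key;
    std::size_t position;
};

// Build a sparse table with one entry per `stride` keys on the page.
std::vector<JumpNode> buildJumpTable(const std::vector<std::string>& pageKeys,
                                     std::size_t stride)
{
    std::vector<JumpNode> table;
    if (stride == 0)
        return table;
    for (std::size_t i = 0; i < pageKeys.size(); i += stride)
        table.push_back({pageKeys[i], i});
    return table;
}

struct LookupResult
{
    bool found;
    std::size_t nodesVisited;  // nodes read during the sequential walk
};

// Start the walk at the last jump node whose key is not greater than
// the target, then read nodes in order until the target is found or passed.
LookupResult findKey(const std::vector<std::string>& pageKeys,
                     const std::vector<JumpNode>& table,
                     const std::string& target)
{
    std::size_t start = 0;
    for (const JumpNode& jump : table)
    {
        if (jump.key > target)
            break;
        start = jump.position;
    }
    std::size_t visited = 0;
    for (std::size_t i = start; i < pageKeys.size(); ++i)
    {
        ++visited;
        if (pageKeys[i] == target)
            return {true, visited};
        if (pageKeys[i] > target)
            break;
    }
    return {false, visited};
}
```

On a page holding the keys A to H with a jump entry every four keys, finding G reads three nodes instead of seven. The saving grows with the page size, which is the effect Ann describes.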

Friday, February 5, 2010

Solaris, Firebird and Robust Mutexes

We have a large Firebird user on Solaris who noticed the following problem with the current Solaris build (pre 2.1.4):

"If there are a bunch of fb_inet_servers running (or any other app like isql, Gpre type apps etc), then it is possible to kill one or more of these processes and hang up all the rest.

I suspect (hunch only) that some mutex or other has been created, and the killed processes can't release it...

The easiest way to get the problem to appear is to create 100 or so busy processes, and to start killing them until the problem appears.

Be nice if you had an idea of how to sort this.."

Cue conversation with Alex about the issue.

"This is a known issue, though we have never been able to reproduce it, except by using a debugger to stop in a particular place and then kill the process. If some process locks a global mutex in the lock (or event) manager, and for some reason (e.g. kill) the process dies when the mutex is still locked, then the mutex remains locked forever. Non SolarisMT ports (like Linux or HPUX) do not have this problem.

The problem is solved in Firebird V2.5 and I think we can backport it to older versions, because it's well localized (related to mutex initialization), and it also seems it requires Solaris 10, but I am not sure whether the required system calls are present in the base release or whether an upgrade is required."

For reference - this is the code in Firebird 2.5, that fixes the issue:

#ifdef HAVE_PTHREAD_MUTEXATTR_SETPROTOCOL
    int protocolRc = pthread_mutexattr_setprotocol(&mattr, PTHREAD_PRIO_INHERIT);
    if (protocolRc && (protocolRc != ENOTSUP))
    {
        iscLogStatus("Pthread Error", (Arg::Gds(isc_sys_request) <<
            "pthread_mutexattr_setprotocol" <<
            Arg::Unix(protocolRc)).value());
    }
#endif
#ifdef USE_ROBUST_MUTEX
    LOG_PTHREAD_ERROR(pthread_mutexattr_setrobust_np(&mattr, PTHREAD_MUTEX_ROBUST_NP));
#endif
(this is mutex init code) and

#ifdef USE_ROBUST_MUTEX
    if (state == EOWNERDEAD)
    {
        // We always perform check for dead process
        // Therefore may safely mark mutex as recovered
        LOG_PTHREAD_ERROR(pthread_mutex_consistent_np(mutex->mtx_mutex));
        state = 0;
    }
#endif

(this is checked if the mutex lock returns an error)
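For anyone who wants to see the mechanism in isolation, here is a minimal self-contained C++ sketch of the same pattern. It makes two assumptions that differ from the Firebird code above: it uses the names later standardised in POSIX.1-2008 (`pthread_mutexattr_setrobust`, `PTHREAD_MUTEX_ROBUST`, `pthread_mutex_consistent`) rather than the `_np` variants, and it simulates the killed process with a thread that exits while still holding the lock. The function names `lockAndDie` and `lockAfterOwnerDeath` are made up for the example.

```cpp
#include <cerrno>
#include <pthread.h>

// Thread body: take the lock and exit without releasing it, standing in
// for a process that is killed while it holds the mutex.
static void* lockAndDie(void* arg)
{
    pthread_mutex_lock(static_cast<pthread_mutex_t*>(arg));
    return nullptr;
}

// Returns the result of the first lock attempt made after the owner has
// died (EOWNERDEAD on platforms with robust mutex support), or -1 if
// setup or recovery failed.
int lockAfterOwnerDeath()
{
    pthread_mutexattr_t attr;
    pthread_mutex_t mutex;
    if (pthread_mutexattr_init(&attr) != 0)
        return -1;
    // Mutex init: ask for a robust mutex.
    if (pthread_mutexattr_setrobust(&attr, PTHREAD_MUTEX_ROBUST) != 0 ||
        pthread_mutex_init(&mutex, &attr) != 0)
    {
        pthread_mutexattr_destroy(&attr);
        return -1;
    }
    pthread_mutexattr_destroy(&attr);

    pthread_t owner;
    if (pthread_create(&owner, nullptr, lockAndDie, &mutex) != 0)
    {
        pthread_mutex_destroy(&mutex);
        return -1;
    }
    pthread_join(owner, nullptr);  // the owner is now gone, lock still held

    // Lock check: a plain mutex would block here forever.
    const int state = pthread_mutex_lock(&mutex);
    if (state == EOWNERDEAD)
    {
        // We hold the lock now; mark the mutex usable again.
        if (pthread_mutex_consistent(&mutex) != 0)
            return -1;
    }
    if (state == 0 || state == EOWNERDEAD)
        pthread_mutex_unlock(&mutex);
    pthread_mutex_destroy(&mutex);
    return state;
}
```

A mutex shared between processes, as in the lock manager case, would also need to live in shared memory and have the `PTHREAD_PROCESS_SHARED` attribute set, which this sketch leaves out.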

To be able to use this code, Solaris must support the PTHREAD_MUTEX_ROBUST_NP attribute.

The answer to this is yes - Solaris does support it.

So we backported the relevant code and started the build, only to find the following compile error:

../src/jrd/isc_sync.cpp: In function 'int ISC_mutex_init(mtx*, SLONG)':
../src/jrd/isc_sync.cpp:3026: error: 'LOCK_ROBUST' was not declared in this scope
../src/jrd/isc_sync.cpp: In function 'int ISC_mutex_lock(mtx*)':
../src/jrd/isc_sync.cpp:3049: error: 'mutex_consistent' was not declared in this scope

To fix this you need to upgrade to libc version SUNW_1.23, as this was implemented sometime in 2008 (see this link).

Tuesday, January 19, 2010

Stoves

That's both a cooker and a maker of them. Follow this link.

I bought a Stoves 700DFD0 cooker (gas hob, grill, double oven) in December 2003, and paid quite a chunk of money for it. Seven years on, I would like my money back. Never will I purchase anything from Stoves again.

The top oven/grill of my Stoves freestanding double oven has now stopped working for the third time in four years, each time with the same problem and the replacement of the same part, the thermostat. On previous occasions we have (a) got a repair company to fix it, and (b) ordered the part ourselves and got a friend to fix it. This time, after it failed after nine months, we got our electrician/repair man, who is a good friend, to speak directly to Stoves and try to find out what was going on. It seems that this part has now been replaced by a new design, not because it keeps breaking down, just because ...it has.

So we ordered the new part directly from Stoves and he came round to repair the cooker. The new part we had been sent is also faulty. As is the replacement part (new thermostat) for the main oven (which currently has the endearing, if somewhat dangerous habit, of turning itself on when it feels like it).

Other issues, besides the fact it breaks down on a regular basis: the stainless steel hob gets stained by any burnt food and is uncleanable; the gas burners don't sit properly and hence won't light; the wire cooking racks no longer fit into their supports; and the oven temperature is 20 degrees higher than it should be AFTER allowing for the fan, and is hotter at the back than the front.

So if you want a new cooker, think hard about buying a "Stoves".
