Oracle Observations

July 12, 2007

EAGAIN (again)

Filed under: AIX — bigdaveroberts @ 3:00 pm

One of the more interesting aspects of running a blog is the ability to see the search terms that people typed into the search engine that redirected them here.

Thus I can see that almost every day someone searches for EAGAIN and lands on my post about performance problems with async I/O on AIX.

As that entry covers a number of issues, I think it is worth revisiting the topic and dedicating a single post to EAGAIN warnings under AIX.

The history of the EAGAIN problem under AIX, as I understand it
(based largely on supposition rather than hard fact!):

When IBM originally produced the asynchronous I/O subsystem for buffered file systems on AIX 3, the implementation was sub-optimal: on occasion it would unnecessarily lock the inode, and it was not always genuinely asynchronous.

Oracle then used IBM’s asynchronous I/O API to implement async I/O on AIX.

There are then two possibilities as to what happened next.

Oracle gave insufficient instructions in the setup guide concerning async I/O configuration in the AIX environment, and when IBM re-wrote the async I/O subsystem, Oracle began to generate EAGAIN warnings indicating a poor configuration that had previously been hidden by the inefficient initial implementation.

or

When IBM re-wrote the async I/O subsystem, they added an additional configuration parameter which, without an appropriate setting, resulted in numerous EAGAIN warnings.

What certainly did happen was that IBM introduced new bugs into the subsystem, which required several iterations of patches to resolve.

Whatever the cause, many people running Oracle on AIX encountered an increasing number of “Warning: lio_listio returned EAGAIN” messages.

Oracle’s response was to blame AIX, as the warnings had not occurred before the upgrade, and IBM blamed Oracle, as all IBM had done was improve the efficiency of its async I/O subsystem.

What should you do if you encounter EAGAIN warnings under AIX?

Firstly, you should ensure that the appropriate AIX operating system patches have been applied.

Check the bos.rte.aio fileset level with:

# lslpp -l bos.rte.aio
bos.rte.aio 5.1.0.25 COMMITTED Asynchronous I/O Extension
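
You can also look at the current async I/O configuration. On AIX 5.x the settings live as attributes of the aio0 pseudo-device, so (assuming that device name, and with abridged, purely illustrative output rather than recommended values) something like:

# lsdev -C | grep aio0
aio0 Available  Asynchronous I/O
# lsattr -El aio0
minservers 1    MINIMUM number of servers  True
maxservers 10   MAXIMUM number of servers  True
maxreqs    4096 Maximum number of REQUESTS True
...

The maxreqs attribute is the queue length discussed below, and minservers/maxservers control how many AIO server processes are started.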

Secondly, you should accept that eradicating EAGAIN warnings is no guarantee that you have actually resolved the underlying problem, nor does the occasional EAGAIN warning necessarily indicate a problem.

As the basic explanation of the message indicates, the warning is an indication that the I/O system is not running optimally.

In AIX, the async I/O implementation consists of a single queue holding all pending disk writes, with multiple server processes executing the write requests.
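
As an aside, if you want to see how many of these server processes are actually running at any moment, the usual recipe (assuming root access; the aioserver kernel processes show up in pstat) is something like:

# pstat -a | grep aios | wc -l

which simply counts the aioserver kprocs currently started.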

When an EAGAIN warning occurs, it is simply an indication that the async I/O request queue is full, and Oracle has to absorb the overhead of submitting the request again.

If you increase the size of the queue, you will reduce the number of warnings and slightly reduce the workload on Oracle; however, this should not be your first consideration or goal.
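
For reference, on AIX 5.x the queue length is the maxreqs attribute of the aio0 device (with minservers and maxservers controlling the server processes), so raising them looks something like the following; the values are purely illustrative, not recommendations, and the -P flag defers the change to the next restart, which is generally required because the device is in use:

# chdev -l aio0 -a maxreqs=8192 -a minservers=10 -a maxservers=20 -P
aio0 changed

The same attributes can also be set through the smitty aio menus.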

The best way to increase the efficiency of the system is to increase the rate at which disk writes are completed, and thus removed from the queue, by increasing I/O bandwidth (replacing RAID 5 with mirroring, using faster disks, reducing disk contention, etc.). Secondly, you should look at reducing the number of disk writes added to the async I/O queue, through redo reduction and by dropping indexes before data loads and recreating them afterwards.

Only after using these general methods of increasing I/O efficiency should you turn to tuning the async I/O subsystem itself.

You should bear in mind that, while async I/O lets you configure multiple processes to write to a disk simultaneously, a hard disk can only physically write to one place at a time. It is only through the combination of command queuing (NCQ) and on-disk buffers that multiple write processes per disk will actually increase throughput. Thus, even if you do reduce the number of EAGAIN warnings by increasing the number of write processes, that is no guarantee that the system has actually become faster. Similarly, by increasing the size of the async I/O queue you may well reduce the number of EAGAIN warnings, but if the memory used could have been better spent on a larger SGA, overall performance may be reduced even though the number of warnings has gone down.
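
One way to tell whether such changes are actually helping, rather than merely suppressing the warnings, is to measure the async I/O subsystem directly. On later AIX releases (5.3 onwards, if memory serves) iostat grew an -A flag that reports asynchronous I/O activity alongside the normal disk statistics, for example:

# iostat -A 5 3

which prints three samples at five-second intervals; watching how the outstanding request counts compare with maxreqs is rather more informative than counting warnings.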

Obviously, if you haven’t changed the configuration of the system and the number of EAGAIN warnings is on the increase, then that is an indication of a problem, but the solution may well not lie in the realm of the DBA; it may be that new, inefficient routines are being introduced by the developers.

In short, the EAGAIN warning should not be considered the problem itself, but rather another symptom which, if not eradicated, probably needs to be monitored and managed.

Further information:

IBM documents on tuning Oracle on AIX:

http://www-941.ibm.com/collaboration/wiki/download/attachments/5570/Oracle_on_AIX_WebinarFeb2007.pdf

http://www.sioug.si/sioug2005/datoteka.jsp?filename=Tomaz%20Vincek%20-%20Oracle%2010g%20Performance%20Tuning%20v%20UNIX%20okolju.pdf

For the enthusiastic, AIX documentation on the lio_listio function:

https://www-rz.uni-hohenheim.de/betriebssysteme/unix/aix/aix_4.3.3_doc/ext_doc/usr/share/man/info/en_US/a_doc_lib/aixprggd/kernextc/async_io_subsys.htm

And an interesting metalink article:

34924.996

IBM response to a query about aiostat (a tool IBM supplied to analyse the volume of asynchronous I/O calls):

http://www-1.ibm.com/support/docview.wss?uid=std3295a41c3f8c73bda49256f66000ecf3d

April 12, 2007

Async I/O and all that.

Filed under: AIX,Unix — bigdaveroberts @ 10:10 am

While I do like to characterise myself as a DBA, I have only had one all-nighter, which sort of exposes the conceit of my claim.

The system involved was a large (for the time) data warehouse, and the database had crashed unexpectedly on the day before the main weekly data load. (If I recall correctly, it was due to a control file timeout.) While it was not unusual for the system to fail (mainly because the database files were sparse, for some unknown historical reason, which made space management somewhat problematic), as the weekly load was due shortly the system was being monitored more closely on the evening that the main loads commenced.

After the data load started, the system crashed again. The system was recovered, looked fine, and the data load was initiated again, and again the database crashed. The cycle repeated. I believe that, in all, the database crashed 11 times in close succession.

To be honest, I handed the problem over to another DBA in the morning; they initiated a disaster recovery while continuing to investigate the problem.

The first problem found was one identified by Oracle. We had recently enabled Parallel DML, and this had implicitly enabled parallel recovery. Oracle 8.1.7.2 apparently had issues with parallel recovery which made it somewhat less efficient than normal recovery. This meant that Oracle continued to perform recovery operations long after the database had been restarted.

The issue was that having had one crash increased the workload and contention in the database, causing the multiple additional crashes.

What had caused the original crash?

Well, at least 3 issues.

1. A new contract developer had been recruited by the client who didn’t realise what a smart idea it is to drop indexes before a data load and recreate them afterwards. Over time, as his code was promoted to live, this substantially increased the work done during the data loads.

2. There was contention between the control files and the redo logs due to poor file locations.

3. The system was running on AIX, and the Oracle manual’s guidance for configuring AIX asynchronous I/O was inadequate. While the manual recommended increasing the maxservers and minservers settings, it made no recommendation regarding the queue length (maxreqs).

There were also hints of a conflict between IBM and Oracle, in that it was suggested that the underlying cause of the asynchronous I/O problem was a rewrite of IBM’s async I/O subsystem; when errors subsequently began to appear from the Oracle database, Oracle blamed IBM, as Oracle had worked without errors before the change.

In terms of problem resolution:

The first moves were to disable parallel recovery with the assistance of Oracle support, temporarily suspend the new data load, and disable async I/O in Oracle (at my initiative).
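
For what it is worth, turning async I/O off on the Oracle side at that time just meant setting an init.ora parameter and bouncing the instance. A minimal sketch, assuming a standard init.ora location and the 8i-era disk_asynch_io parameter:

$ grep -i disk_asynch_io $ORACLE_HOME/dbs/init${ORACLE_SID}.ora
disk_asynch_io = false

with the instance restarted for the change to take effect.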

The result was that the crashes stopped.

I was instructed by my team leader to reinstate async I/O, as in his view, since the customer was paying for gold support, we should get a proper resolution from Oracle rather than use a workaround that would potentially reduce the performance of their system.

Further investigation revealed an increasing trend in [EAGAIN] warnings being logged by Oracle, and, with the benefit of a large control file (and hence a long history in v$archived_log), a rather elegant graph of the deterioration in the time taken to archive redo log files (due to disk contention) was produced using the following SQL:

SELECT recid,
       To_Char( next_time, 'YYYY-MON-DD HH24:MI:SS' ),
       ( completion_time - next_time ) * ( 24 * 60 * 60 ) "Seconds to archive"
  FROM v$archived_log
 ORDER BY recid
/

We then spent a period of time finding ‘colder’ disks on which to locate the control files, and, armed with a script to log the number of [EAGAIN] warnings per hour, we proceeded to tune the async I/O parameters sufficiently to minimise the number of warnings produced.

The end result was parameter settings far in excess of the recommendations in the installation guide, limited in fact only by the number of async I/O processes that Oracle would start with (this turned out to be about 350), plus a script that automatically counted the number of [EAGAIN] warnings and alerted us by email if the count went over a certain threshold.
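
For anyone wanting to knock up something similar, a minimal sketch is below. The alert log location, threshold, recipient and schedule (run it hourly from cron) are all assumptions to adapt; it simply diffs the total warning count against the figure saved on the previous run:

#!/bin/ksh
# Mail an alert if more than THRESHOLD new EAGAIN warnings have been
# logged since the last run.  All paths and values are illustrative.
ALERT_LOG=/u01/app/oracle/admin/${ORACLE_SID}/bdump/alert_${ORACLE_SID}.log
STATE_FILE=/tmp/eagain_count.last
THRESHOLD=100
RECIPIENT=dba-team@example.com

# Total EAGAIN warnings in the alert log so far, and the total last time we looked.
current=$(grep -c 'EAGAIN' "$ALERT_LOG")
previous=$(cat "$STATE_FILE" 2>/dev/null || echo 0)
delta=$((current - previous))
echo "$current" > "$STATE_FILE"

if [ "$delta" -gt "$THRESHOLD" ]; then
    echo "$delta EAGAIN warnings logged since the last check" | \
        mailx -s "EAGAIN warnings on $(hostname)" "$RECIPIENT"
fi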

To an extent, that was the end of the problem, although after a couple of months one of the developers spotted that a query would return differing results depending on which execution plan was used. After investigation, an attempt was made to rebuild the index, which failed. Having passed the call to Oracle, we were advised to upgrade to 8.1.7.3. After this upgrade had been performed, we were able to rebuild the index. Unfortunately the disparity in the results persisted, and we eventually dropped and rebuilt the table.

While the corruption can’t be proven to be related to the large number of crashes encountered, it does seem to be more than coincidence that such a bizarre problem occurred shortly after a substantial number of system crashes.
