Wednesday, November 5, 2008

orapwd and standby in 11g

I was trying to build standby for Oracle 11g database to test new
features of Data Guard and stuck with problem that archive logs had
not been transferred to standby side because of the following error:

Error 1017 received logging on to the standby
------------------------------------------------------------
Check that the primary and standby are using a password file
and remote_login_passwordfile is set to SHARED or EXCLUSIVE,
and that the SYS password is same in the password files.
returning error ORA-16191
------------------------------------------------------------


The password files have been created on both sides using orapwd:
orapwd file=orapwdb11 password=sys
and remote_login_passwordfile parameters were set to EXCLUSIVE.
And I was able to login as sysdba to remote sites using either TNS or
easy naming methods.

After different attempts I recreated password files with password
in upper-case (thinking about case sensitivity for password in 11g):
orapwd file=orapwdb11 password=SYS force=y
and it worked - primary started to send logs to standby.

Trying to find the reason of such behavior I recognized that there is
the parameter that responsible for password sensitivity in 11g -
SEC_CASE_SENSITIVE_LOGON.
I set it to FALSE, changed passwords to lower-case, bounced databases
and... the same problem as before :(

That meant the parameter does not influence on case sensitivity of
password during sysdba connection.

And only after that case I found out that there is a new parameter for
orapwd utility - ignorecase

Usage: orapwd file= password= entries= force= ignorecase= nosysdba=

where
file - name of password file (required),
password - password for SYS (optional),
entries - maximum number of distinct DBA (required),
force - whether to overwrite existing file (optional),
ignorecase - passwords are case-insensitive (optional),
nosysdba - whether to shut out the SYSDBA logon (optional Database Vault only).


I changed
SEC_CASE_SENSITIVE_LOGON back to TRUE,
recreated password file with lower-case characters and ignorecase=y,
bounced databases and everything went smoothly!

Have a good day!

Monday, November 3, 2008

switching $ORACLE_BASE using oraenv in 11g

I recently installed a couple of Oracle 11g on VM. First one was installed
to /u01 but after testing different things I added one more to /u05.
But it does not really matter, the main thing that 11g raised importance
of $ORACLE_BASE variable added it to different places - oraenv script
(you have noticed sentence about oracle base after execution of the script)
and as parameter to init file.

All that means the $ORACLE_BASE should be dependent on
ORACLE_HOME and be saved somewhere. Moreover the script should
change it! But it was not so. Everytime I ran it $ORACLE_BASE left the
same even they based on the same path as $ORACLE_HOME:


[oracle@oel1 bin]$ . oraenv
ORACLE_SID = [db11106] ?
The Oracle base for ORACLE_HOME=/u01/app/oracle/product/11.1.0/db_1 is /u05/app/oracle
[oracle@oel1 bin]$ . oraenv
ORACLE_SID = [db11106] ? db11a
The Oracle base for ORACLE_HOME=/u05/app/oracle/product/11.1.0/db_2 is /u05/app/oracle
[oracle@oel1 bin]$ . oraenv
ORACLE_SID = [db11a] ? db11106
The Oracle base for ORACLE_HOME=/u01/app/oracle/product/11.1.0/db_1 is /u05/app/oracle


Checking the script I found the part that responsible for
oracle base change and comments there:


# Set the value of ORACLE_BASE in the environment. Use the orabase
# executable from the corresponding ORACLE_HOME, since the ORACLE_BASE
# of different ORACLE_HOMEs can be different.
# The return value of orabase will be determined based on the following :
#
# 1. Value of ORACLE_BASE in the environment.
# 2. Get the value of ORACLE_BASE from oraclehomeproperties.xml as
# set in the ORACLE_HOME inventory.



Going further I found that oraclehomeproperties.xml is under
$ORACLE_HOME/inventory/ContentsXML and it contains properties
and one of them is:

...
property name="ORACLE_BASE" val="/u05/app/oracle"
...

(/u01/app/oracle was for the other oracle home)

That's the place where $ORACLE_BASE and $ORACLE_HOME variable
are put together. But the script doesn't switch base if it was already set.
And the reason of it - orabase script ($ORACLE_HOME/bin).
If you have $ORACLE_BASE value set - orabase returns this value.
If not - it gets value from oraclehomeproperties.xml file:

[oracle@oel1 bin]$ . oraenv
-- switched to another home and got wrong $ORACLE_BASE
ORACLE_SID = [db11a] ? db11106
The Oracle base for ORACLE_HOME=/u01/app/oracle/product/11.1.0/db_1 is /u05/app/oracle
-- wrong value
[oracle@oel1 bin]$ echo $ORACLE_BASE
/u05/app/oracle
-- orabase returns wrong value
[oracle@oel1 bin]$ which orabase
/u01/app/oracle/product/11.1.0/db_1/bin/orabase
[oracle@oel1 bin]$ orabase
/u05/app/oracle
-- unset $ORACLE_BASE
[oracle@oel1 bin]$ export ORACLE_BASE=
[oracle@oel1 bin]$ echo $ORACLE_BASE

-- and now orabase returns right value
[oracle@oel1 bin]$ orabase
/u01/app/oracle



Finally I added the "export ORACLE_BASE=" entry before
"ORABASE_EXEC=$ORACLE_HOME/bin/orabase" to null the value that
orabase could accomodated the real value of $ORACLE_BASE variable.


[oracle@oel1 bin]$ vi /u01/app/oracle/product/11.1.0/db_1/bin/oraenv
[oracle@oel1 bin]$ vi /u05/app/oracle/product/11.1.0/db_2/bin/oraenv
[oracle@oel1 bin]$ . oraenv
ORACLE_SID = [db11106] ?
The Oracle base for ORACLE_HOME=/u01/app/oracle/product/11.1.0/db_1 is /u01/app/oracle
[oracle@oel1 bin]$ . oraenv
ORACLE_SID = [db11106] ? db11a
The Oracle base for ORACLE_HOME=/u05/app/oracle/product/11.1.0/db_2 is /u05/app/oracle


Have a good day!

Wednesday, October 29, 2008

adrci and Data Guard logs

I made an installation of Oracle 11g recently and had started to set up
Data Guard when got some errors during configuration. Since the new
version of databases came with Automatic Diagnostic Repository (ADR)
I thought it might be useful and interesting to check if adrci can
provide some view of not only alert log but Data Guard log files as well.

Unfortunately I did not find such possibility but at least dr...log file
can be added to an incident package if it is generated.

I hope in future releases ADR will integrate Data Guard logs as well.

Have a good day!

Tuesday, October 14, 2008

"ORA-10388: parallel query server interrupt" for streams process

It is almost a week of battle between streams process and us who support it.
It failed on weekend (as usual) with bunch of errors:

Errors in file /u01/oracle/admin/PROD/bdump/prod01_p001_21536.trc:
ORA-00600: internal error code, arguments: [kgh_heap_sizes:ds], [0x2A9808E358], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [kgh_heap_sizes:ds], [0x2A9808E358], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [kgh_heap_sizes:ds], [0x2A9808E358], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [kghGetHpSz1], [0x2A9808E368], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [kgh_heap_sizes:ds], [0x2A9808E358], [], [], [], [], [], []
ORA-07445: exception encountered: core dump [kghalf()+561] [SIGSEGV] [Address not mapped to object] [0xFFFFFFFFFFFFFFF0] [] []
...
ORA-01438: value larger than specified precision allowed for this column
...
ORA-10388: parallel query server interrupt (failure)


Restart of apply processes did not help neither bouncing a database.

After creation of Metalink SR and during substitution of tables in streams
with materialized views two patches were applied
(10.2.0.4 on Linux 64-bit):
7272297
"
ORA-00600 [17114] OR ORA-600[17125] - INSTANCE TERMINATION"
and 5868257
"MALFORMED RCI MARKER REDO DUE TO ORA-1551 IN UPDATE DRIVER"

but only the last one helped to avoid the issue and recovered streams
process as it was before.

Streams process failed on updates not for all tables but for some of them
which have many columns with names up to maximum length allowed.

So if you have such tables and plan to add them to streams I would
recommend to check the availability of the patch
5868257 for your database.

Have a good day!


Wednesday, September 17, 2008

How to exclude ORA- errors from alerts in Grid Control

There is a Note: 330996.1 on Metalink how to filter
certain alert.log errors from appearing in alerts.

There is example there how to ignore certain errors
from coming to alerts of database.

Unfortunately proposed expression doesn't work - might be some
characters hidden in html code of the page or might be simply missed.
Since it is based on regular expressions there must be clause
to ignore certain patterns using symbol ^.

After some testing I came with the expression like this
.*ORA-0[^(3333|2222|1111)].*
that allows to ignore ORA-01111, ORA-02222, ORA-03333 messages.

Also I removed "Generic Alert Log Error Status" metric to elude messages
about ignored errors.

The pattern was tested on EM GC Rel.4 on 10.1.0.4 on Linux.


Have a good day!

Tuesday, September 16, 2008

ORA-1873 accessing the SQL_RESPONSE_COLLECT metric

So you upgraded your databases to 10.2.0.4 and also recreated hc...dat file not
to hang Grid Control agent, enabled health check, started agent -
everything to fix consequences of bug https://metalink.oracle.com/metalink/plsql/showdoc?db=Bug&id=5872000 (see post http://dbadailytelegraph.blogspot.com/2008/08/killing-health-check.html)
and executed tail -f emagent.trc to see if some errors coming - 1, 2, 3 minutes,
everything looks to be alright.

Don't hurry up! :(

The errors are coming! And they are from anonymous PL/SQL block that runs
to measure "SQL Response Time (%)" metric and it scheduled to gather
information every 5 minutes by default:

ORA-06512: at line 7
2008-09-15 22:12:17,815 Thread-4064263072 ERROR engine: [oracle_database,test,sql_response_collect] : nmeegd_GetMetricData failed : ORA-01873: the leading precision of the interval is too small
ORA-06512: at "DBSNMP.MGMT_RESPONSE", line 1013
ORA-06512: at "DBSNMP.MGMT_RESPONSE", line 923
ORA-06512: at line 7

2008-09-15 22:12:17,816 Thread-4064263072 WARN collector: Error exit. Error message: ORA-01873: the leading precision of the interval is too small
ORA-06512: at "DBSNMP.MGMT_RESPONSE", line 1013
ORA-06512: at "DBSNMP.MGMT_RESPONSE", line 923
ORA-06512: at line 7


The search on Metalink leads to https://metalink.oracle.com/metalink/plsql/ml2_documents.showDocument?p_database_id=NOT&p_id=6737060.8
and affected (confirmed) versions of databases are 10.2.0.3 and 10.2.0.4
and it will be fixed in 11.1.0.7 and 11.2

Disabling the metric will help to avoid errors but it can be important metric at some stage.


Have a good day!

Wednesday, August 27, 2008

using EM Grid Control templates to change metrics

If you use EM Grid Control and have a lot of databases added there you
definitely have templates to manage metrics depending on what type of
databases you monitor. And of course you apply changes not only through
GUI but using emcli as well.

Are they the same? They should be but at least recently I found that
using emcli to apply metrics works I would say more correctly and allows
to avoid some errors on a stage of targets selection.

The code to apply template is simple:

./emcli apply_template -name="test template" -targets="test1:oracle_database" -copy_flags="1"

where copy_flags is 0 or 1 depending
on you willing to rewrite metrics or not.

Well, looks like it is not difficult to create or change metrics on number of
targets but what about to delete some of them based on criteria or at least
those named the same.

I stuck with it when the metric to monitor tablespace's free size percentage
was added wrongly:

* The metric "Tablespace Space Used (%)" named %UNDO%
with thresholds 110 and 110 for warning and critical;

* The template with the metric was applied on dozens of databases.

Even metrics were added and had 110 % they did not work properly -
something was wrong either with name (even there is a tip to use wildcards
for the same pattern) or with per cent which is more than 100.
And the task is just to avoid paging on undo tablespaces.

After many tests of different configurations for metric names and thresholds
I found out that only 100 per cent works for thresholds and name should
not be with % symbols.

The template was changed to have a new metric name
UNDOTBS01 (100 and 100 %%) and I was ready to apply it using emcli but
what about old supposed-to-be-working ones?
How to delete them all at once or to generate a script for emcli?

Unfortunately emcli has no such command to change existing metrics
on target and I was forced to delete wrong metrics one by one manually
through Grid Control interface.
It was not quick and I spent about an hour routinely clicking the same buttons.
And I still have no idea how to do it massively.
To trace EM actions? To search Metalink?
To use tricks and tips in existing EM version?
I will try and

you have a good day!


All said before is for EM Grid Control 10.2.0.3
10.2.0.4 came with tips that
"
Wildcard characters for object names are not supported."
"The "warning and critical threshold combination"
should be either (0,0), (NULL, NULL)
or (0 < warning threshold < 100, 0 < critical threshold < 100,
warning threshold <>


Attempt to apply metric for non-existing tablespace
will be executed successfully
but it will not be there after update.

Thursday, August 14, 2008

"Refreshed" database files naming

I have been doing many "refreshes" last days.
One of scenarios is to refresh a development
or test database from backup of production database.
The databases are on ASM and names consist dev
or test or prod in them.

Nothing difficult about it - get backup restored to disk,
start instance in nomount mode, restore controlfiles,
catalog backupsets, restore and recover database and open it.
And rename a database after the procedure using nid utility.

But the thing is that if you restore a database from
backup to OMF destination as ASM or file system,
oracle will use database name (if db_unique_name
is not set to something different) to name folders
where datafiles are going to be.
That means files will appear in folders named as source database
and will have "prod" or "prd" (or whatever the pattern is)
in folder names.

It can be nothing bad with it, but if you are going to restore
the same database under different name on the same server
you will get problem. Not to mention violation of all standards.
And also even after successful open of restored database
and its renaming you will have all files staying at the same place.

So if you would like to avoid it put db_name as source one
to you initialization file and add db_unique_name set to the name
of what database will be renamed to later. This will help
to restore datafiles to right location and you will need
just to rename the database afterwards.

Have a good day!

Tuesday, August 12, 2008

wrong results for 10.2.0.4 with cursor_sharing='SIMILAR'

If you running Oracle 10.2.0.4 using cursor_sharing='SIMILAR'
you probably already aware about a bug that causes database
to produce wrong results for some type of sql statements.

If you have not run into it yet, try to execute the following
statement with cursor_sharing set to SIMILAR (either on system
or session level)

select t1.d1, t2.n1
from
(select 1 as d1 from dual d1, dual d2) t1,
(select 1 as d1, 2 as n1 from dual) t2
where t2.d1(+) = t1.d1
group by t1.d1, t2.n1;


If you got 1 and null in output - this is it, you hit the one.

Further change of cursor_sharing to 'EXACT' will return
1 and 2 and if after that you want to get bad results again
(changing to 'SIMILAR' back) nothing happens - from now on
you will see good results (for this particular cursor) until
you bounce a database.

The bug 7277101 is also applicable for 11.1.0.6 and
unfortunately has not fixed yet.


Have a good day!

Thursday, August 7, 2008

Killing health check

Do you think health check can kill?
Well... at least it can make oracle agent "not running" and left hung
agent processes.

It is not a common situation for all versions of agents and databases.
The problem affects oracle agent 10.2.0.4 and either 10.2.0.3 or
11.1.0.6 databases on Linux x86-64 platform.

The symptoms like this: you just upgraded you agents up to
10.2.0.4 version, saw fine ("green") status in Grid Control
and later (or next day) you see agents down and not running.
What happened?

Most likely, you hit the problem described in Note:566607.1
and agent opened too many hc_database_name.dat files
(which are in $ORACLE_HOME/dbs) and crashed.

That is because it was trying to open file using
a slightly different structure to access health check files
(1544 bytes instead of 1552 as it was before) but failed
and left file used. It continues until "too many files opened" error.

As a remedy patch for Bug 5872000 can be applied to database
or health check of instance can be disabled in Grid Control.

Probably the best one will be to apply patch but
it requires to bounce database and that makes me thinking
that I should prepare upgrades even of agents
carefully since it can even lead to patch a database!


Have a good day!

Wednesday, August 6, 2008

Silent upgrade of Oracle Agent to 10.2.0.4

Recently I got a task to upgrade RAC-aware
oracle agents from 10.2.0.3 to 10.2.0.4.

I presumed that installation available on OTN for mass
deployments will do the trick but I failed on that assumption.

Finally I recognized that upgrade should be done with patch
3731593 which is available on Metalink.

Due to security reasons there is only access to command line
on the server and that led me to use silent installation of the
patch. It was not the first time when I used it so I started to
put necessary values to the copy of patchset.rsp file.

Looking through the contents of the file I discovered
with pleasure that there is no entry for ORACLE_HOME_NAME
variable there. It is not a problem to find out it (/etc/oraInst.loc shows
location of oracle inventory and there is ContentsXML/inventory.xml
file there with information about oracle home paths and names)
but I can't see the reason to use it during installation.

Anyway, I made a file for silent installation and executed installer
(stopped agents on each host before it) - got an error

3731593/Disk1$ ./runInstaller -silent responseFile /patch/3731593/Disk1/response/patchagent10204.rsp
...
OUI-10202:No response file is specified for this session.
A silent session requires inputs from a response file or
from the command line. Specify the response file and re-run the installer.


Ok sounds familiar. How about to add "=" after responseFile parameter
Ran it, got an error again

./runInstaller -silent responseFile=/patch/3731593/Disk1/response/patchagent10204.rsp
...
SEVERE:Values for the following variables could not
be obtained from the command line or response file(s):
ORACLE_HOME(OracleHome),
ORACLE_HOME_NAME(OracleHomeName)
Silent install cannot continue.


Great! ORACLE_HOME_NAME had appeared!
And even ORACLE_HOME was not recognized :(

Finally I added those 2 variables to command line
and successfully executed upgrade of oracle agent
(remote installation also was completed without
problems)

./runInstaller -silent responseFile=/patch/3731593/Disk1/response/patchagent10204.rsp
ORACLE_HOME="/u01/app/oracle/product/10.2.0/agent10g"
ORACLE_HOME_NAME="agent10g"

And moreover the installation did not ask to run any root.sh scripts
which saved SA's time :)

After upgrade everything went smoothly - new versions of the agents
had been discovered in Grid Control, files were uploaded and
monitoring had resumed. Lucky me!

And the great news - I got Troubleshooting Oracle Performance book!

What a good day!
Have a good one too!

Tuesday, August 5, 2008

Synchronize your bookmarks!

In our days it is important for DBA to have access to variety of
information that is provided by different sources over Internet -
forums, blogs, technical support sites etc.

All information of course can be downloaded to local PC
but definitely it is simpler to make a bookmark to have
a quick access to information.

I personally create links for Metalink notes,
for Oracle articles on OTN etc.
And until recently it was a problem for me to have these
links available and synchronized on
a quantity of computers that I use every day.

But now I am using Foxmarks Bookmark Synchronizer
(works with Firefox 1.5-3.0) which helps me to save
bookmarks on any of computers and use them later
from any other one.

Try it and feel relief!

Have a good day!

Monday, August 4, 2008

Optimisation of "SELECT * FROM"

This is my first post on the blog that I created recently. The name of the blog implies to have
regular updates on the blog with articles about variety of database administration aspects.
Will it happen? Well... at least it is in the name :)

The first one is going to be the short one and it is about optimization of the important for every
dba statement "SELECT * FROM". You probably think that I am going to tell you some secret
about optimizer not exposed in all details recently by gurus (since it is my first post, right?)
No, it won't happen. I just want you to think how many times per day you type
"SELECT * FROM " or similar statement (with addition of rownum for example)?
Do you remember how many times you made errors in that simple statement
when you were in hurry fixing something and/or using unaccustomed keyboard?

How about to simplify work using a script like this:


SELECT * FROM &1
/
undef 1


If you save it under the name s.sql you can use it like in the manner:


SQL> @s dual


Moreover, you can enhance scripts to select 10, 20, etc or based on like criteria rows from tables.

I use it to select table names from dictionary when I can't recall exact names and underscore
symbols whereabouts there.


dict.sql:

select table_name from dict where table_name like upper('%&1%')
/
undef 1


@d tables


As soon as you get how helpful your scripts are you can start using them over Internet:

@http://<...>/s.sql



That's it for today.
Have a good one!