DBA Masterclass 2020

Today I took the DBA Masterclass Quiz (truthfully just a bit too easy…) and earned the above badge from Oracle.

Oracle put on a three-class event that was fun and informative. You can watch the recordings even though the live presentations are finished.

For me, one of the best outcomes was connecting directly with Russ Lowenthal (@RussLowenthall), a member of the Database Security Team, and having a great conversation about fixing the whole tls_wallet (or ssl_wallet if you are a bit older) issue, which requires DBAs to download root and intermediate certs so that the database can access TLS encrypted URLs. I presented Russ with some interesting ideas which he said the database team at Oracle would consider for future releases. Basically we’d all like the ability for the database to ‘just work’ with signed URLs the same way your browser just works.

Here’s to hoping the security team gets around to fixing this!


Oracle Backup Terminology

Oracle backup terminology can be kind of confusing. And, by can be, I mean that it is confusing. There are lots of different terms that sound similar but mean very different things. I’ve seen very experienced DBAs use the following statements synonymously: “I just did a whole backup” and “I just did a full backup”. The correct response to “I just did a whole backup.” is “Thanks! I appreciate it.”, while the correct response to “I just did a full backup.” is “Of what?” or maybe even “Why?”. Because, according to Oracle’s terminology, those two statements mean quite different things.

I’ll be defining the terminology that Oracle uses in their training material and documentation and, in addition, making suggestions on what your backups should most often look like.

Backup Strategy – Whole or Partial

Every Oracle backup is either a whole backup or a partial backup.

A whole backup in Oracle terminology means that you’ve backed up all data in all datafiles and at least one control file. Since the control files (you have configured multiple control files, right?) are copies of each other, you technically only have to back up a single control file.

So what is a partial backup? It’s a backup that isn’t all data in all datafiles and a control file, but instead less than that. If you back up a single datafile, you’ve done a partial backup. If you back up all datafiles but one, you’ve done a partial backup. If you back up all data in every single datafile, but don’t back up at least one control file, you’ve done a partial backup.

Backup strategy tells you how much of your database you are backing up. You are either backing up all data in all datafiles and at least one control file, or you are backing up something different than that.

For your backup strategy you should start with a whole backup of your database. After that initial backup you can either do partials forever using the Oracle Suggested Backup Strategy, or you can bounce back and forth between whole and partial backups. To me, the least effective backup strategy is to always do whole backups, but some folks do use this strategy. I much prefer to start with a whole backup and then do partial backups from then on using the Oracle Suggested Backup Strategy.

As an aside, notice that a whole backup doesn’t technically need an spfile to be included in the backup, but I always include an spfile in all my whole backups.

Backup Type – Full or Incremental

The next term that Oracle uses to describe a backup is Backup Type. Every backup is either a Full Backup or an Incremental Backup.

And… Welcome to the start of the confusion! Whole does not equal Full. While you can do a whole full backup, you can also do a partial full backup.

Backup type tells you how the backup can be used relative to other backups. Generally, a full backup stands on its own and isn’t used in conjunction with other backups. A full backup backs up all the data in whatever you are backing up. An incremental backup can be used with other incremental backups and, depending on its level (which I’ll explain in a moment), either backs up all the data in whatever you are backing up, or all the data that has changed since your previous incremental backup.

Incremental Backup Types – Level 0, Level 1 Differential, Level 1 Cumulative

To further increase backup type complexity, incremental backups have different types themselves. Incremental backups are either a Level 0 backup or a Level 1 backup. And, to make backup types even more confusing, Level 1 backups are also of different types! Incremental Level 1 backups are either Cumulative or Differential backups.

Let’s start with a Level 0 backup. A Level 0 backup is identical to a full backup in that it contains all of the data in whatever you are backing up (remember, it could be either a whole or partial backup that you are doing), but it has one additional property: It can be used as the base backup for later incremental level 1 backups.

A Level 1 backup contains only data that has changed since a previous backup. Which previous backup? Well, it depends on the incremental Level 1 backup type. If it is an incremental Level 1 Cumulative backup, then it will always contain the changes since the last Level 0 backup. An incremental Level 1 Differential backup, on the other hand, will contain the changes since the previous incremental backup whether it was a Level 0, Level 1 Cumulative, or Level 1 Differential.

One might ask: Why do we have these two different Level 1 backup types?

It’s basically a balance between the size and speed of the backup and the time to recover datafiles past the level 0 restore point. The first differential and the first cumulative backup taken after a level 0 backup are the same, but subsequent differentials should generally be faster and smaller than subsequent cumulative backups. On the other hand, recovering from a single cumulative backup should be faster than recovering from 6 differential backups. However, if you have almost all add operations, then the difference could be negligible, because the cumulative backup ends up holding about the same blocks as the differentials combined. If you have many update operations that touch the same blocks over and over, then the difference could be significant.
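To put some rough numbers on that trade-off, here is a toy model in shell. It is only a sketch: the block numbers are made up, the function names are hypothetical, and it counts changed blocks rather than measuring anything RMAN actually does. Each line of input is the set of blocks changed on one day after the level 0 backup.

```shell
# Blocks in the differential taken on day N: only that day's changes.
differential_size() {      # usage: differential_size N < changes
  sed -n "${1}p" | tr ' ' '\n' | grep -c .
}

# Blocks in the cumulative taken on day N: unique blocks changed on days 1..N.
cumulative_size() {        # usage: cumulative_size N < changes
  sed -n "1,${1}p" | tr ' ' '\n' | sort -u | grep -c .
}

# Blocks read when recovering through day N with differentials: all of them.
differentials_total() {    # usage: differentials_total N < changes
  sed -n "1,${1}p" | tr ' ' '\n' | grep -c .
}

# Update-heavy workload: the same few blocks change again and again.
updates='1 2 3
2 3 4
1 3 4
2 3 5'

# Insert-heavy workload: every day touches brand new blocks.
inserts='1 2 3
4 5 6
7 8 9
10 11 12'

report() {                 # usage: report LABEL CHANGES
  printf '%s: last differential=%s, last cumulative=%s, all differentials=%s\n' \
    "$1" \
    "$(printf '%s\n' "$2" | differential_size 4)" \
    "$(printf '%s\n' "$2" | cumulative_size 4)" \
    "$(printf '%s\n' "$2" | differentials_total 4)"
}

report updates "$updates"
# prints: updates: last differential=3, last cumulative=5, all differentials=12
report inserts "$inserts"
# prints: inserts: last differential=3, last cumulative=12, all differentials=12
```

In the update-heavy case, recovering from one cumulative backup reads 5 blocks instead of 12, while in the insert-heavy case both approaches read the same 12 blocks and the only thing the cumulative backup costs you is a bigger nightly backup.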

Personally I think it is really an artifact from the early days of computing when backups went to tape, and the tape drives required operators to physically mount and unmount the tapes. If you had 6 incremental differential backups, you might have to physically mount and unmount 6 different tapes if you wanted to restore and recover data in your database. Each time you had to do this, you’d get a prompt from the terminal telling you to go find this particular tape, and each time you did that your restore and recover operations would take a bit longer.

These days we don’t really have to worry about this so much. Most of the time, we just do differential backups when we do level 1 backups because the files that make up the backup are usually located in one place instead of separate physical tapes and we don’t have to mount and unmount tape drives any more, or if we do, the capacity is massive compared to the early days of computing and we are not bouncing from tape to tape like we did in the past.

For backup types, I always start with an incremental level 0 backup. In general, I don’t usually do full backups. Since an incremental level 0 is nothing more than a more flexible full backup, a whole incremental level 0 is usually the way to go for a base backup. Once that is completed, I usually follow that up with incremental level 1 differential backups and then apply those level 1 incremental backups to the level 0 backup, effectively moving the level 0 backup forward in time.
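The roll-forward described above corresponds to what the RMAN documentation calls an incrementally updated backup. As a rough sketch only (the helper name and the tag incr_update are made up for illustration, and the generated commands should be checked against the RMAN documentation for your release), a small shell function can print the pair of commands that are typically run together:

```shell
# Hypothetical helper: print an RMAN command file for an incrementally
# updated image copy. The tag defaults to 'incr_update' (an arbitrary name).
# RECOVER COPY rolls an existing level 1 backup into the image copy.
# BACKUP ... FOR RECOVER OF COPY creates the level 0 copy on the first run
# and a level 1 backup on later runs.
rman_incr_update_script() {
  tag="${1:-incr_update}"
  cat <<EOF
RUN {
  RECOVER COPY OF DATABASE WITH TAG '$tag';
  BACKUP INCREMENTAL LEVEL 1 FOR RECOVER OF COPY WITH TAG '$tag' DATABASE;
}
EOF
}

# Show the generated commands; nothing is sent to RMAN here.
rman_incr_update_script
```

Saved to a file, the output could then be handed to RMAN with something like rman target / cmdfile=&lt;that file&gt;, scheduled however you normally schedule backups.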

Backup File Type – Image Copy or Backup Set

So… Here’s another source of confusion. Backups generate output files, and there are two different types of output files: Image Copies or Backup Sets.

An image copy (which RMAN, the tool Oracle supplies to manage and use database backups, just shortens to COPY) is an exact bit for bit duplicate of the file. It includes all used and unused space in the file. So, if you’ve created a 1 TB datafile and only have 10 MB of data in it and you create an image copy backup of this file, your output file will be… 1 TB.

With backup sets, instead of taking an exact copy of the whole file, Oracle just extracts the actual information from the file and then creates a new file (or multiple files if specified and/or necessary) that contains the necessary information. Additionally, this file can also be compressed, so it is often much, much smaller than an image copy backup. That 1 TB datafile with 10 MB of data in it backed up as a backup set could, potentially, be even smaller than 10 MB.

Furthermore, image copy backups are always either full backups or incremental level 0 backups. If you have an incremental level 1 backup (either kind!), then it can’t be an image copy (it only has information that has changed since a previous backup) and therefore it will always be a backup set.

Since backup sets are so efficient, one might wonder why you’d ever create an image copy. It turns out that the efficiency of a backup set (much less storage used for the backup) is limited to the creation process. When it becomes time to recreate the original file (called a restore in Oracle terminology, which is done from either a full or incremental level 0 backup) all the data in the backup set must be read and the original file is recreated step by step until you have a copy of the file at the point in which it was backed up. With a restore we are going to create a bit for bit copy of the original file at the time of the backup… and that is exactly what an image copy is! So, instead of recreating the original file, if you have an image copy backup, you can actually point to the image copy and tell the database to use it directly. This means that the time to restore the file from an image copy can be effectively instantaneous. The time to restore a file from a backup set depends on the size of the backup set, so as your backups get bigger, your restore time gets longer if you are using backup sets.

The computer science way of saying this is:

  • A restore operation of an image copy backup can be an order 1 operation.
    Using the RMAN switch command, the amount of time to restore a 1 MB datafile is the exact same amount of time to restore a 32 GB datafile.
  • A restore operation of a backup set is always an order N operation.
    We can’t switch to a backup set, so we’ll have to read all the data and recreate the datafile from scratch. So a 1 MB datafile will restore much more quickly than a 32 GB datafile.

Backup File Destination – Disk or Tape

In the earlier days of computing, computer storage was broken into two distinct buckets: Disks which were faster but extremely expensive per storage unit, and tapes which were slower but much less expensive per storage unit. These days the lines between disk and tape have become a bit blurred with cloud backups that can appear as either disk or tape and could go (on the cloud) to disk or tape under the covers (under the cloud?). Also, the cost differences generally favor tape.

As far as Oracle backups go, backups are written to one of two different devices: DISK (self explanatory) and SBT (which stands for System Backup to Tape). It’s important to understand that these device types are logical rather than physical. If you configure some AWS or Oracle Cloud Object Storage in the cloud as a local drive mounted to your computer and do a disk backup to it, as far as Oracle is concerned the backup went to disk (even though it went to ‘the cloud’). If you configure the SBT driver to point to some disks, Oracle will consider the backup written to this device to be a tape backup. The Oracle Database Cloud Backup Module, for example, turns cloud storage into a ‘logical tape drive’.

Backups to the logical device type disk can be either image copies or backup sets. Backups to the logical device type SBT however can only be backup sets.

So, it seems kind of clear: You can back up to image copies or backup sets on disk, and backup sets can go to tape… but, guess what (remember, this can be a bit confusing). It turns out that there are two different kinds of ‘device type disk’ backups. Oracle gives you the ability to define a special disk location called the fast recovery area. Of course, just to make things fun, when this special location was first introduced it was called the flash recovery area. When Oracle introduced this term it was before the dawn of flash disks (usually called solid state drives now). Since this could be confusing(!), Oracle decided to change the name of the flash recovery area to the fast recovery area. At least the abbreviation for the fast recovery area is the same as it always was: FRA.

So, what’s the difference between a disk backup to the FRA and a disk backup to ‘not the FRA’? Really, nothing. They are the same. There is no ‘different information’ in a backup to the FRA vs. a backup to ‘not the FRA’. However, backups to the FRA are managed differently by Oracle.

The FRA is defined by setting two Oracle database parameters (not RMAN configuration settings as one might expect): DB_RECOVERY_FILE_DEST which points to a location logically on the server, and DB_RECOVERY_FILE_DEST_SIZE which determines how much space is logically allocated for this particular database to use.

It’s important to realize that the db_recovery_file_dest_size is logical, not physical. This means that if you point to a mount point (let’s say you set db_recovery_file_dest to /u02/app/oracle/fast_recovery_area) that has 1 TB of storage allocated to it, but you set the db_recovery_file_dest_size to 2 TB, Oracle won’t complain in the least… until you actually write more than 1 TB of information to the FRA. At that point you’ll get some out of space errors from the OS and whatever operation Oracle was trying to do in the FRA will fail. So, obviously, that would be just silly to do.

Let’s assume that you are going to set your FRA logical size to be something that makes sense for the system you are running Oracle on. What does using this ‘logically identified and sized space’ get you? Well, if you put items into the FRA and they are no longer needed to meet your retention targets (these are defined with RMAN configuration settings) then Oracle will, if it can, automatically remove no longer needed items for you. If you do not have those two database parameters set then your disk backups go to ‘not the FRA’ and you have to manually maintain the space. If you do have those two database parameters set, and you do a disk backup to ‘someplace other than the FRA’ then again, you’ll have to manually maintain the space.
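Going back to the logical versus physical size point: since Oracle won’t stop you from setting a logical size bigger than the disk underneath it, a quick sanity check from the shell can help. This is just a sketch with a hypothetical helper name; it compares a planned size against the total size of the file system (not the free space, and not whatever else shares that mount), so treat a pass as necessary rather than sufficient.

```shell
# Hypothetical helper: succeed only if a planned db_recovery_file_dest_size
# (given in KB) is no larger than the file system holding the FRA directory.
fra_size_fits() {
  fra_dir="$1"
  planned_kb="$2"
  # df -Pk prints one portable line per file system; column 2 is its size in KB.
  fs_kb="$(df -Pk "$fra_dir" | awk 'NR == 2 { print $2 }')"
  [ "$planned_kb" -le "$fs_kb" ]
}

# Example: would a 2 TB logical size fit where the current directory lives?
if fra_size_fits . $((2 * 1024 * 1024 * 1024)); then
  echo "planned FRA size fits on this file system"
else
  echo "planned FRA size is larger than the file system"
fi
```

On a real server you would point it at the directory you plan to use for db_recovery_file_dest rather than the current directory.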

So far we have covered the following Oracle Backup Terminology:

  • Backup Strategy
    • Whole
    • Partial
  • Backup Type
    • Full
    • Incremental
      • Level 0
      • Level 1
        • Differential
        • Cumulative
  • Output File Type
    • Backup Set
    • Image Copy
  • Output File Location
    • Tape (SBT)
    • Disk
      • FRA
      • not FRA

For now, this seems like quite a bit of ground to cover, and this post is getting quite long. At some point in the future I’ll attempt to cover other things like the following:

  • Database Log Mode
    • ARCHIVELOG
    • NOARCHIVELOG (I call this ‘polish your resume mode’.)
  • Backup Mode
    • Consistent/Cold
    • Inconsistent/Hot
  • Using Created Backups
    • Restore
      • SP Files
      • Control Files
      • Data Files
      • Archive Logs
    • Recover
      • Database Data Files
      • Image Copy Backups (Incrementally Updated Image Copies)
    • Creating Database Clones
    • Point In Time Recovery
  • Backup Targets
    • Database
    • Data Files
    • Archivelogs
    • FRA
  • Instance/Database Types
    • Target
    • Auxiliary
    • Catalog

And more?

Happy DBA’ing!


Too many audit logs? xargs to the rescue!

After watching a client pipe the results of a find command into a text file, and then edit the text file to add rm in front of each line and then turn the text file into a script and run it… I knew I needed to add this here.

Sometimes you have too many audit logs and rm *.aud returns the following:

-bash: /bin/rm: Argument list too long

If so, we can use xargs to remove them all:

find . -type f -name '*.aud' | xargs rm
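The one-liner above is fine for audit files with ordinary names. If there is any chance of spaces or quotes in the file names, a slightly more defensive variant passes the names NUL-delimited. The function name here is made up, and the demo runs in a throwaway directory rather than a real audit destination:

```shell
# Hypothetical helper: remove every *.aud file below the given directory.
purge_audit_files() {
  # -print0 and -0 keep each file name intact as a single argument,
  # and xargs splits the list into batches that fit on a command line.
  find "$1" -type f -name '*.aud' -print0 | xargs -0 rm -f
}

# Demo in a temporary directory.
demo_dir="$(mktemp -d)"
touch "$demo_dir/ora_1.aud" "$demo_dir/ora 2.aud" "$demo_dir/keep.log"
purge_audit_files "$demo_dir"
ls "$demo_dir"
# prints: keep.log
```

Both GNU and BSD find also have a -delete action (find . -type f -name '*.aud' -delete) that skips xargs entirely.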

Happy Linux’ing.


Naming Oracle Container Databases

The introduction of container databases in Oracle 12.1 created a whole new world of Oracle Databases dividing them into non-Container Databases (we’d been using these all the way through Oracle 11g, we just didn’t know it) and Container Databases.

Starting with Oracle 20c all Oracle Databases will be container databases, so we’ll no longer be able to stick with the non-Container Database architecture that is, to many at least, more familiar.

From the 20c documentation:

Note: A multitenant container database is the only supported architecture in Oracle Database 20c. While the documentation is being revised, legacy terminology may persist. In most cases, “database” and “non-CDB” refer to a CDB or PDB, depending on context. In some contexts, such as upgrades, “non-CDB” refers to a non-CDB from a previous release.

Great! That won’t be confusing at all… (Yes, I’m being sarcastic). Hopefully, the documentation will be revised very quickly.

I’ve been teaching the Oracle Education Oracle Database: Managing Multitenant Architecture course ever since the release of 12.1 and I’ve additionally been managing multitenant container databases for multiple customers for many years now.

Over the years I’ve seen many different naming conventions for naming multitenant container databases and the pluggable databases that are deployed inside them.

Oracle, and many of Oracle’s customers, have been referring to the Root Container (CDB$ROOT) of a container database as the CDB, and the pluggable databases inside a container database (which, according to the Oracle documentation, are also containers themselves) as PDBs.

One big issue with the CDB/PDB terminology is that it can be hard, at least in English, to distinguish between CDB and PDB when speaking. So, naming a container database cdb1 and naming a pluggable database inside of it pdb1 can make things very hard to understand when having conversations about container databases. “Wait, did you say cdb? Or pdb?”

The container database architecture is really designed to be used with the clients and/or applications connecting to the pluggable databases as if they were the old style stand-alone databases (or non-container databases as they are now called). Generally, nobody except the DBA team is going to be connecting to the root container. Additionally, and probably most importantly, regular database users and/or applications don’t care that they are connecting to a pluggable database. So, adding pdb as a suffix to all your pluggable databases really doesn’t add any value.

I feel pretty confident that the current naming scheme that I’m using with many of my customers actually works in real life. It’s basically two rules:

  • Rule One:
    Name the actual container database with a “con” suffix.
  • Rule Two:
    Name the pluggable database what you would have previously named a standalone database and never add a pdb suffix or prefix.

Let’s look at some examples:

  1. You want to create a ‘playground’ database named orcl.
    The container database would be named orclcon and there would be a single pluggable database named orcl.
  2. You want to create a container database on a production server.
    The container database would be named prodcon and there would be a pluggable database named prod.
  3. You only have one database server (and therefore it is the production server), but you want three databases, prod, test, and dev.
    The container database would be named prodcon and there would be three pluggable databases: prod, test and dev.
  4. You have three database servers, let’s say dev, test, and prod.
    Three container databases named devcon, testcon and prodcon, one per server, each containing a single pluggable named dev, test, and prod respectively.

    Here, however, things could get a bit more interesting. Let’s say we decide to give each developer their own pluggable database. Our dev team is composed of Jill, John, and Jane. We’d still have devcon as the name of the container database that supports development databases, but instead of a single pluggable named dev, we’d have three pluggables named jill, john, and jane.

    Now let’s say that on your test server, you decide that you are going to have four databases: test, integration testing, quality assurance, and training. We’d still have testcon as the name of the container database, but we’ll now have four pluggables named test, int, qa, and train.

After playing with different naming schemes over the years, I’ve found that the naming scheme above works the best in the real world.


Connecting to Oracle with Passwords with Spaces

Every once in a while I’ll be in the situation where two things intersect: I’ll want to log into the database using the command line directly (using sqlplus, sqlcl, expdp, rman, etc.) and someone has created a password with spaces in it for the user I want to connect as.

One might assume that you’d only have to escape the double quotes around the password (using \" ), but it turns out you actually need to provide an extra set of double quotes around the password when it has spaces in addition to the escaped double quotes.

[oracle@thecloud ~]$ sqlplus sys@orcl as sysdba
SQL*Plus: Release 18.0.0.0.0 - Production on Sat Dec 14 06:14:33 2019
Version 18.6.0.0.0
Copyright (c) 1982, 2018, Oracle.  All rights reserved.
Enter password:
Connected to:
Oracle Database 18c EE Extreme Perf Release 18.0.0.0.0 - Production
Version 18.6.0.0.0
SYS@orcl AS SYSDBA> create user bob identified by "Password with spaces";
User created.
SYS@orcl AS SYSDBA> grant create session to bob;
Grant succeeded.
SYS@orcl AS SYSDBA> exit

[oracle@thecloud ~]$ sqlplus bob/\""Password with spaces\""@orcl 
SQL*Plus: Release 18.0.0.0.0 - Production on Sat Dec 14 06:18:26 2019
Version 18.6.0.0.0
Copyright (c) 1982, 2018, Oracle.  All rights reserved.
Connected to:
Oracle Database 18c EE Extreme Perf Release 18.0.0.0.0 - Production
Version 18.6.0.0.0
BOB@orcl > exit

If bob’s password was just Password and not “Password with spaces” then we’d be able to connect using either of the below:

[oracle@thecloud ~]$ sqlplus bob/Password@orcl
SQL*Plus: Release 18.0.0.0.0 - Production on Sat Dec 14 06:18:26 2019
Version 18.6.0.0.0
Copyright (c) 1982, 2018, Oracle. All rights reserved. 
Connected to: 
Oracle Database 18c EE Extreme Perf Release 18.0.0.0.0 - Production 
Version 18.6.0.0.0
BOB@orcl > exit

[oracle@thecloud ~]$ sqlplus bob/"Password"@orcl
SQL*Plus: Release 18.0.0.0.0 - Production on Sat Dec 14 06:18:26 2019 
Version 18.6.0.0.0
Copyright (c) 1982, 2018, Oracle. All rights reserved. 
Connected to: 
Oracle Database 18c EE Extreme Perf Release 18.0.0.0.0 - Production 
Version 18.6.0.0.0
BOB@orcl > exit

Once the password has spaces in it, you must include the ‘double double quotes’ and escape the first set of quotes as in the first example.
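If you want to see why the doubled quotes are needed, it helps to look at what the shell actually hands to the program. The sketch below uses a made-up show_args function in place of sqlplus, so nothing here touches a database; it only prints the arguments after the shell has processed the quoting.

```shell
# Print each argument on its own line, in brackets, to show exactly
# what the shell passes to a program such as sqlplus.
show_args() {
  for arg in "$@"; do
    printf '[%s]\n' "$arg"
  done
}

# The escaped quotes survive as literal characters; the inner, unescaped
# pair only protects the spaces from word splitting.
show_args bob/\""Password with spaces\""@orcl
# prints: [bob/"Password with spaces"@orcl]

# Without the inner pair the shell splits the connect string on the spaces.
show_args bob/\"Password with spaces\"@orcl
# prints three lines: [bob/"Password] then [with] then [spaces"@orcl]
```

So the escaped quotes are the ones the Oracle client gets to see, and the extra unescaped pair exists purely to keep the shell from breaking the connect string into three separate arguments.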

 


Solved (?) SQL Developer 19.2: No more authentication methods available when using SSH Connections

When Oracle upgraded SQL Developer from 19.1 to 19.2 they updated some libraries. One of those updates meant that SSH keypairs that had been generated by PuttyGen didn’t work anymore, and the advice was to “Generate OpenSSH format ppk” (see Oracle Support Doc ID: 2575854.1).

However, when attempting to do so using PuttyGen it didn’t seem to work. Taking a step back, here was the situation in 19.1 or earlier:

  1. You had SQL Developer 19.1 running on your desktop and had used the SSH Hosts dialog to create an SSH connection pointing SQL Developer at the private key on your machine (typically, this is an id_rsa file).
  2. The server you are connecting to has the associated public key (typically, id_rsa.pub) in the authorized_keys file associated with the account you were connecting to on the remote server.
  3. Everything worked as expected and you could connect to databases on the remote database server using the SSH connection.

After an upgrade to 19.2 (during which SQL Developer will ask you to import settings from a previous release), there is a chance that the private key that you previously had pointed to will no longer work.

My ‘used to work in 19.1 and earlier’ private key file looked like this:

-----BEGIN RSA PRIVATE KEY-----
Proc-Type: 4,ENCRYPTED
DEK-Info: AES-128-CBC,abcdefghijklmnopqrstuvwxyz123456
                      
abcdefghijklmnopqrstuvwxyz1234567890abcdefghijklmnopqrstuvwxyz12
[SNIP]
abcdefghijklmnopqrstuvwxyz1234567890abcdefghijklmnopqrstuvwxyz12
-----END RSA PRIVATE KEY-----

If I opened that file in Putty Key Generator on a Windows 10 box, it said that it was already an OpenSSH SSH-2 Private key.

Since Oracle Support and others said that the key needed to be in OpenSSH format, one would assume that maybe you’d use the Conversion feature of Puttygen to convert the key…

Nope. No go.

After playing around for a long time I discovered the OpenSSH commands that are now built into Windows 10. After testing a million things, this is what actually worked:

C:\sshkeys> ssh-keygen -e -f id_rsa > id_rsa_for_sql_developer_19.2
Enter passphrase:

Once pointed at the id_rsa_for_sql_developer_19.2 file, SQL Developer happily prompted me for my (very long) passphrase and established a connection to my database.

Great! Problem solved.

However, and this is the strange thing that I don’t really understand… If you look at the contents of the generated file, you’ll see something like this:

---- BEGIN SSH2 PUBLIC KEY ----
Comment: "2048-bit RSA, converted by rich@ROTOR from OpenSSH"
abcdefghijklmnopqrstuvwxyz1234567890abcdefghijklmnopqrstuvwxyz12
abcdefghijklmnopqrstuvwxyz1234567890abcdefghijklmnopqrstuvwxyz12
abcdefghijklmnopqrstuvwxyz1234567890abcdefghijklmnopqrstuvwxyz12
abcdefghijklmnopqrstuvwxyz1234567890abcdefghijklmnopqrstuvwxyz12
abcdefghijklmnopqrstuvwxyz1234567890abcdefghijklmnopqrstuvwxyz12
abcdefghijklmnopqrstuv
---- END SSH2 PUBLIC KEY ----

Yep, that is what it looks like, a public key in SSH2 format (although I obviously obfuscated the above, for what reason, I do not know)…

Why an SSH2 Public Key works when you should have a private key (if I understand SSH connections correctly, and this sort of makes me feel like I don’t), I have no idea. But work it did, so I happily moved on.

As an aside, I did have an SR open with Oracle on this which I ended up closing myself, but I did ask them to update Doc ID: 2575854.1 with the above information before I closed it.

Happy SQL Developer-ing!


orabasetab is wrong if you install as grid

Traditionally when you install the Oracle Grid Infrastructure for RAC, the owner of the GI software is a user named grid and the owner of the database software is a user named oracle.

With the 19c version of the GI, the software is shipped as a zip file that you extract. Unfortunately, there is a bug (if you are doing a traditional install) in this extract where the orabasetab file (located here if you are using a traditional OFA compliant install: /u01/app/19.0.0.0/grid/install/orabasetab) has the following contents:

#orabasetab file is used to track Oracle Home associated with Oracle Base
/u01/app/19.0.0/grid:/u01/app/oracle:OraGI19Home1:N:

Which is fine if you are going to use only the oracle user as the owner of both the GI software and the database software, but if you want to install the GI software as the grid user and the database software as the oracle user then you’ll need to modify the file like so:

#orabasetab file is used to track Oracle Home associated with Oracle Base
# 2019-09-29 Rich Soule updated the original value below to the new value
# so the GI could be installed as the grid user
#/u01/app/19.0.0/grid:/u01/app/oracle:OraGI19Home1:N:
/u01/app/19.0.0/grid:/u01/app/grid:OraGI19Home1:N:

If you don’t do this then there is a real good chance you’ll see the following when you attempt your GI install: Error 49802 initializing ADR
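If you’d rather script the change than edit the file by hand, something like the following could work. It is a sketch, not a tested part of any Oracle install procedure: the helper name is made up, it assumes the colon-separated layout shown above with the Oracle Base in the second field, and it keeps the shipped file as a .orig copy instead of leaving a comment trail in the file itself. The demo works on a temporary file, not a real Grid home.

```shell
# Hypothetical helper: set the Oracle Base (second colon-separated field)
# on every non-comment line of an orabasetab-style file.
set_oracle_base() {
  file="$1"
  new_base="$2"
  cp "$file" "$file.orig"          # keep the shipped file for reference
  awk -F: -v OFS=: -v base="$new_base" '
    /^#/ || NF < 2 { print; next }   # comments and blank lines pass through
    { $2 = base; print }             # field 2 is the Oracle Base
  ' "$file.orig" > "$file"
}

# Demo against a temporary copy of the shipped contents.
demo="$(mktemp)"
printf '%s\n' \
  '#orabasetab file is used to track Oracle Home associated with Oracle Base' \
  '/u01/app/19.0.0/grid:/u01/app/oracle:OraGI19Home1:N:' > "$demo"
set_oracle_base "$demo" /u01/app/grid
cat "$demo"
# prints the comment line unchanged, then:
# /u01/app/19.0.0/grid:/u01/app/grid:OraGI19Home1:N:
```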

Good luck with your RAC installs…

Rich


X11 Forwarding with MIT Magic Cookies to Oracle Cloud as multiple users

Every time I do this, I always forget the steps, especially for allowing connections after you su to another user… So, here we go:

I’m going to connect two machines: rotor (it’s a palindrome), my local Windows machine, and cloudbox, my Oracle Cloud server.

The first thing I’m going to do is connect into cloudbox as the opc user, become root and then update the sshd_config file with an X11UseLocalhost no entry. I also commented the previous value (which was already commented out) and added a comment about who changed what and when. Then I’ll restart the ssh daemon.

[opc@cloudbox ~]$ sudo -s
[root@cloudbox opc]# vim /etc/ssh/sshd_config

# 2019-08-12 Rich Soule changed below to allow remote X11 Connections
#X11UseLocalhost yes ## Original value commented out as in this line
X11UseLocalhost no

[root@cloudbox ~]# systemctl restart sshd

Next, we’ll make sure that enough of the X11 tools are on the server.

[root@cloudbox ~]# yum install xclock -y

I’m going to use MobaXterm as my client on my Windows box and open up a local terminal on rotor. MobaXterm automatically gives me an X11 Server so I don’t have to use something like Xming to give me a local X11 Server.

When I created cloudbox, my Oracle Cloud server, a private key file was created. This private key file was saved to rotor, my Windows box, in a folder on my local computer, so I’m going to cd to that folder and start an ssh connection:

[Rich.Rotor] ➤ ssh -X -i id_rsa opc@cloudbox

At this point, I have to provide the passphrase for my private key file that is in this directory. After supplying the passphrase, I’ll be connected to cloudbox.

Last login: Mon Aug 12 22:23:10 2019 from somewhere on the internet
/usr/bin/xauth:  file /home/opc/.Xauthority does not exist
[opc@cloudbox ~]$

The message above should really be something like “.Xauthority does not exist, so I’m creating it.” because that is what just happened. The DISPLAY environment variable was set to the IP address of the cloud server with a :10.0 appended to the end, and we can see what ended up in the .Xauthority file by using the xauth list command:

[opc@cloudbox ~]$ echo $DISPLAY
10.10.0.2:10.0
[opc@cloudbox ~]$ xauth list
cloudbox.myreg.myvcn.oraclevcn.com:10  MIT-MAGIC-COOKIE-1  6ab3d32cf1c543ecaf83c79297ee3fbc

At this point, X11 based commands will now work, but only for the opc user.

[opc@cloudbox ~]$ xeyes&
[1] 13177

If I become another user, then X11 commands won’t work.

[opc@cloudbox ~]$ sudo su - oracle
Last login: Mon Aug 12 22:35:15 GMT 2019 on pts/0
[oracle@cloudbox ~]$ xeyes&
[1] 13595
[oracle@cloudbox ~]$ Error: Can't open display:

[1]+ Exit 1 xeyes
[oracle@cloudbox ~]$

Looking above, it appears that the DISPLAY environment variable for the oracle user has yet to be set. The blank line after the message essentially tells us it has a NULL value. However, even if we set it, it still doesn’t work yet.

[oracle@cloudbox ~]$ export DISPLAY=10.10.0.2:10.0
[oracle@cloudbox ~]$ xeyes&
[1] 14213
[oracle@cloudbox ~]$ X11 connection rejected because of wrong authentication
Error: Can't open display: 10.10.0.2:10.0

[1]+  Exit 1                  xeyes
[oracle@cloudbox ~]$

The trick at this point is to pass along the MIT Magic Cookie that got generated for the opc user to the oracle user. The easiest way to do this is to just copy and paste the full output from the xauth list command as the opc user into an xauth add command as the oracle user:

[oracle@cloudbox ~]$ xauth add cloudbox.myreg.myvcn.oraclevcn.com:10 MIT-MAGIC-COOKIE-1 6ab3d32cf1c543ecaf83c79297ee3fbc
xauth: file /home/oracle/.Xauthority does not exist
[oracle@cloudbox ~]$ xeyes&
[1] 14512
[oracle@cloudbox ~]$

At this point, everything works and X11 commands will now display on my local Windows box from Oracle Cloud as the oracle user.
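Copying the cookie by hand is easy to get wrong, so the step could also be scripted. The following is a minimal sketch, not part of the setup described above: build_xauth_add is a hypothetical helper name, and it assumes each line of xauth list output has the three fields shown earlier (display, protocol, cookie).

```shell
#!/bin/bash
# Turn each line of `xauth list` output (read from stdin) into an
# `xauth add` command that can be pasted into another user's shell.
# build_xauth_add is a hypothetical helper name used for illustration.
build_xauth_add() {
  local display proto cookie
  while read -r display proto cookie; do
    # Skip blank or incomplete lines.
    [ -n "$cookie" ] || continue
    printf 'xauth add %s %s %s\n' "$display" "$proto" "$cookie"
  done
}

# Usage, as the opc user:
#   xauth list | build_xauth_add
```

The printed command is then run as the oracle user, exactly as in the manual copy and paste above.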

Happy Linuxing!

 


Starting Oracle Database on Linux 7 using systemd and making OEM start only after the database is up

Recently I created a new virtual machine for the Oracle Database Admin, Install and Upgrade class that I teach at ACC. Previously I’d used Oracle Virtual Box on my local machine and then uploaded the image to Oracle Cloud and used Ravello to give each of my students their own server.

It was actually pretty straightforward:

  1. Upload the latest Oracle Linux (7.6) ISO that I got from eDelivery.oracle.com.
  2. Create a new blank machine with the following:
    1. 4 CPUs
    2. 24 GB of RAM
    3. 200 GB of disk
    4. Mount the uploaded ISO as a CD-ROM
    5. An elastic IP
    6. Services as follows:
      1. SSH (port 22)
      2. HTTPS (port 443)
      3. VNC (port 5901)
      4. HTTPS (port 7803 for Cloud Control)
  3. Start the image, configuring Oracle Linux 7 with the following:
    1. Server with GUI for software
    2. A static IP address (I used 10.0.0.15, but you could use anything).
    3. IP filtering so my ACC students could access the servers while they are in the labs at ACC and I could access the machines from home
    4. Partition the disk into a 16 GB swap partition, a 10 GB /home partition and then the rest of the disk as the root partition.
    5. When it came time to reboot the server, remove the CD-ROM image and update the configuration before rebooting so the image boots up using the disk.
  4. Install Oracle 12c and create an emrep repository database for OEM 13.3.
  5. Install OEM 13.3.
  6. Install Oracle 11g and create a database that will be upgraded during the course.

At this point everything was great, but since I teach 3-hour classes on Mondays and Wednesdays and shut the servers down between classes, my databases and OEM need to come up cleanly. Oracle has documentation on creating services to automatically start up databases on Linux/Unix, but it uses the old System V method for starting services (which, to be fair, does still work on Linux 7). Since this was a Linux 7 server, I wanted to use the new systemd method. Tim’s rather fantastic site had the basic framework, but where he used scripts that he called from the service, I wanted to use dbstart and dbshut so that we could maintain startup and shutdown from a single file (/etc/oratab) rather than modifying a script.
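Since dbstart only acts on /etc/oratab entries whose last field is Y, it can be handy to see which databases that covers before relying on the service. A minimal sketch, assuming the standard SID:ORACLE_HOME:Y|N line format; autostart_sids is a hypothetical helper name, and the optional file argument exists only so the function can be tried against a copy of the file.

```shell
#!/bin/bash
# Print the SIDs flagged Y in oratab, i.e. the ones dbstart will start.
# autostart_sids is a hypothetical helper; it defaults to /etc/oratab.
autostart_sids() {
  local oratab="${1:-/etc/oratab}"
  # Ignore comment lines; blank lines have fewer than 3 fields.
  awk -F: '$0 !~ /^[ \t]*#/ && NF >= 3 && $3 == "Y" { print $1 }' "$oratab"
}

# Usage:
#   autostart_sids
```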

I created the following file:

[root@dba ~]# vim /usr/lib/systemd/system/oracle-database.service
[Unit]
Description=The Oracle Database Service
After=syslog.target network.target

[Service]
# systemd ignores PAM limits, so set any necessary limits in the service.
# Not really a bug, but a feature.
# https://bugzilla.redhat.com/show_bug.cgi?id=754285
LimitMEMLOCK=infinity
LimitNOFILE=65535

Type=oneshot
RemainAfterExit=yes
User=oracle
Group=oinstall
Restart=no

ExecStart=/usr/bin/echo 'Starting Oracle Databases with Y in /etc/oratab'
ExecStart=/u01/app/oracle/product/12.2.0/dbhome_1/bin/dbstart /u01/app/oracle/product/12.2.0/dbhome_1
ExecStart=/usr/bin/echo 'dbstart has completed'

ExecStop=/usr/bin/echo 'Stopping Oracle Databases'
ExecStop=/u01/app/oracle/product/12.2.0/dbhome_1/bin/dbshut /u01/app/oracle/product/12.2.0/dbhome_1
ExecStop=/usr/bin/echo 'dbshut has completed'

[Install]
WantedBy=multi-user.target

I then enabled the service using the following:

[root@dba ~]# systemctl daemon-reload
[root@dba ~]# systemctl enable oracle-database

While the above worked great to start the database (tested with a reboot of the server), it didn’t address another issue. Unlike the database, Oracle Enterprise Manager comes with ‘out of the box’ scripts to start and stop OEM. They are the old-style System V scripts that run out of /etc/init.d, and it didn’t really seem worth going through the trouble of converting them to the new systemd format. Unfortunately, the OEM scripts always assume that the database is already up and running. If your repository database is running on the same server as your OMS (which isn’t really that big of a deal if your hardware can handle it), that can be fixed by modifying the OEM startup script and adding in a ‘check to make sure your database is up and running before you start OEM’ section. The lines below were added to the out-of-the-box OEM script after the initial comments in the file.

[root@dba ~]# vim /etc/init.d/gcstartup
# 2019-03-05 Rich Soule
# OEM should only startup if the emrep database is already up and running
# on the local machine so the below was added to make sure that happens.
#################### Begin Rich Soule Added Lines #######################
if [ "$1" = "start" ]
then
  counter=0
  db_up=0
  # Check for the emrep pmon process up to 24 times, 10 seconds apart.
  while [ $counter -lt 24 ]
  do
    if /usr/bin/ps -e | /usr/bin/grep -q ora_pmon_emrep
    then
      db_up=1
      break
    fi
    echo 'OEM is waiting on Oracle database to start'
    sleep 10
    counter=$((counter + 1))
  done
  # Decide on the flag, not the counter, so a database found on the
  # last check is not treated as a timeout.
  if [ $db_up -eq 0 ]
  then
    echo 'Oracle database did not start in time, exiting OEM startup'
    exit 1
  fi
  echo 'Oracle database started, waiting 20 more seconds for database to open'
  sleep 20
  echo 'OMS will now attempt to start as per remainder of the /etc/init.d/gcstartup script'
fi
####################  End Rich Soule Added Lines ########################

The above first checks to make sure that gcstartup was called with the start argument. If so, we check whether there is an ora_pmon_emrep process running (emrep is the name of my OEM repository database). If it isn’t running, we wait 10 seconds and check again, but only for 240 seconds total. Once that process is found, we break out of the while loop. If we hit the 240-second limit, we exit the gcstartup script entirely; otherwise, we wait 20 seconds for the database to open and then continue with the rest of the gcstartup script.
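The same wait-and-retry logic could also be factored into a small reusable function. This is a sketch, not the code that was added to gcstartup: wait_until and pmon_running are hypothetical names, and the retry count and interval are passed in as arguments rather than hard-coded.

```shell
#!/bin/bash
# wait_until TRIES INTERVAL COMMAND [ARGS...]
# Run COMMAND up to TRIES times, sleeping INTERVAL seconds after each
# failure. Returns 0 as soon as COMMAND succeeds, 1 if it never does.
wait_until() {
  local tries="$1" interval="$2" attempt=0
  shift 2
  while [ "$attempt" -lt "$tries" ]; do
    if "$@"; then
      return 0
    fi
    attempt=$((attempt + 1))
    sleep "$interval"
  done
  return 1
}

# Hypothetical check: is the pmon process for the given SID running?
pmon_running() {
  ps -e | grep -q "ora_pmon_$1"
}

# Usage (24 tries x 10 seconds = 240 seconds at most):
#   wait_until 24 10 pmon_running emrep || exit 1
```

Keeping the check as a separate command means the same loop can wait on something other than a local process.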

So far this has been working like a charm.

Note that you could do the same type of thing when the OEM repository database lives on a remote server: have the script attempt a connection to that server every 10 seconds to make sure the repository is up before attempting to start OEM.
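One way to sketch that remote check is a plain TCP probe of the listener port using bash’s /dev/tcp redirection. Assumptions: listener_reachable is a hypothetical helper, the host name in the usage comment is a placeholder, 1521 is only the default listener port, and an open port shows that the listener is answering, not that the repository database is open.

```shell
#!/bin/bash
# listener_reachable HOST [PORT]
# Returns 0 if a TCP connection to HOST:PORT succeeds within 5 seconds.
# Hypothetical helper; PORT defaults to 1521, the default listener port.
listener_reachable() {
  local host="$1" port="${2:-1521}"
  # The inner bash receives host as $0 and port as $1.
  timeout 5 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

# Usage in a startup script (repo-db.example.com is a placeholder):
#   for attempt in $(seq 1 24); do
#     listener_reachable repo-db.example.com 1521 && break
#     sleep 10
#   done
```

A stricter check would log in to the repository database itself, which needs credentials and an Oracle client on the OMS host.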

Happy Linux-ing and Oracle-ing!


Party like it’s 1979!

After 120,000+ track miles on my Lotus Exige, I upgraded to a Radical SR3. (Reason: The Exige was too big and heavy. Reason for the Exige upgrade in 2006: The Miata I was driving was too big and heavy. It’s going to be tough going lighter than the Radical, but it is theoretically possible…) I still had the Exige sitting in the garage and, in theory, I’d be able to drive it when I needed to drive a car. In practice, I ended up driving my F350 back and forth two nights a week where I teach Oracle DBA courses as a way to give back to the community. (The rest of the time I work from my home office.)

Stacey and I decided that it would probably be better for the Exige to go to a new home since we were not really using it anymore. (Stacey daily drives and instructs in a 911). A fellow instructor purchased it, fixed it up a bit and turned it over to someone else. He already had an Exige that he tracked quite a bit and built a few times over, so he had the experience to fix my old car up. It’s nice to know that someone else is using the Exige as it is a truly great car.

However, this meant that I had no choice but to drive the F350 around. It’s an old truck (year 2000) with a lot of miles (190K+), which really isn’t that big of a deal to me. However, it is a long bed/dually, so it is not that fun to park. I decided that I should get another car just to drive around in. The smart choice would have been to pick up an old Honda Civic or something for a few K. Who really wants to be smart though…

Enter another track friend, Mark. Mark had a bit of a car problem (too many cars, according to his wife at least) and he had a solution to my dilemma:

1979 Blue Bird Trans Am

That’s a numbers-matching, 6.6L 1979 Bluebird Trans Am with a four-speed.

After getting approval from Stacey that this would be OK, Mark and I traded my cash for his car.

It’s been a bit of an adventure driving around in a 40-year-old car, but it’s been fun.

I just had the original(!) driver’s seat cushions replaced with new foam by Nikito’s Upholstery here in North West Austin, and I’m about to go put the seat back in and take her for a drive.