Thursday, April 25, 2013

Oracle Database Appliance validation failure: network bond2 does not have slave interfaces


Got the following error:


/opt/oracle/oak/bin/oakcli validate -a -f /tmp/validateAll.t 


...

ERROR: Bond interface bond2 has 0 slave interfaces, expected 2 interface
WARNING: Bond interface bond2 has the following current status:down
RESULT: Bond interface bond2 is down configured in mode:fault-tolerance (active-backup) with current active interface as None
...

Not sure what happened at first. I had made a change during testing to bring bond2 up with an IP address assigned to it. Only /etc/sysconfig/network-scripts/ifcfg-bond2 was changed, with IP/subnet/netmask information; ifcfg-eth6 and ifcfg-eth7 were not touched and are still associated with bond2.

After checking, I found that eth6 and eth7 are down, which is why the validation reports zero slave interfaces.

ifconfig -a

...

bond2     Link encap:Ethernet  HWaddr 00:00:00:00:00:00
          BROADCAST MASTER MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)


...

eth6      Link encap:Ethernet  HWaddr A0:36:9F:08:E3:9F
          BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
          Interrupt:25 Memory:df2a0000-df2c0000

eth7      Link encap:Ethernet  HWaddr A0:36:9F:08:E3:9E
          BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:8512319 errors:0 dropped:0 overruns:0 frame:0
          TX packets:867912785 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:1389865582 (1.2 GiB)  TX bytes:388587192156 (361.9 GiB)
          Interrupt:26 Memory:df2e0000-df300000
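
Besides ifconfig, the bonding driver exposes a status file that shows the same information more directly (assuming the standard Linux bonding module):

cat /proc/net/bonding/bond2

It reports the bonding mode, the currently active slave, and the MII status of each slave interface.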


Bring up eth6 and eth7 


ifdown eth6
ifdown eth7

ifup eth6
ifup eth7
ifconfig -a
...
bond2     Link encap:Ethernet  HWaddr A0:36:9F:08:E3:9F
          BROADCAST MASTER MULTICAST  MTU:1500  Metric:1
          RX packets:8512319 errors:0 dropped:0 overruns:0 frame:0
          TX packets:867912785 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:1389865582 (1.2 GiB)  TX bytes:388587192156 (361.9 GiB)

...
eth6      Link encap:Ethernet  HWaddr A0:36:9F:08:E3:9F
          UP BROADCAST SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
          Interrupt:25 Memory:df2a0000-df2c0000

eth7      Link encap:Ethernet  HWaddr A0:36:9F:08:E3:9F
          UP BROADCAST SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:8512319 errors:0 dropped:0 overruns:0 frame:0
          TX packets:867912785 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:1389865582 (1.2 GiB)  TX bytes:388587192156 (361.9 GiB)
          Interrupt:26 Memory:df2e0000-df300000


Validate again with oakcli. The ERROR is gone; bond2 now reports both slave interfaces.
...

WARNING: Bond interface bond2 has the following current status:down
RESULT: Bond interface bond2 is down configured in mode:fault-tolerance (active-backup) with current active interface as None
                Slave1 interface is eth6 with status:down Link fail count=0 Maccaddr:a0:36:9f:08:e3:9f
                Slave2 interface is eth7 with status:down Link fail count=0 Maccaddr:a0:36:9f:08:e3:9e
...

Friday, April 12, 2013

/etc/resolv.conf got overwritten in Oracle EL6


To resolve this problem, there are two options.

1. Use the NetworkManager GUI to change the interface settings:
System -> Preferences -> Network Connections -> Wired tab -> [InterfaceName] -> Edit -> IPv4 Settings tab
Change DNS servers
Change search domains

2. Make the following manual change.
In the file: /etc/sysconfig/network-scripts/ifcfg-<iface>

You need to add all of your specific resolv.conf entries, such as:

DNS1="216.239.32.10"
DNS2="216.83.130.2"
DNS3="216.83.130.7"
DOMAIN="mydomain.com"
SEARCH="mydomain.com. yourdomain.com. otherdomain.com."
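
If /etc/resolv.conf still needs to be maintained by hand, the PEERDNS option in the same ifcfg file controls whether the network scripts modify /etc/resolv.conf at all; setting it to "no" stops the overwriting (worth verifying on your EL6 release, since NetworkManager may also manage the file):

PEERDNS="no"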

Wednesday, April 10, 2013

Steps to set up Apache ZooKeeper

Set up a two-node ZooKeeper cluster.

1. Download zookeeper-3.4.5.tar.gz and extract it with tar xvf.

2. Go to the zookeeper-3.4.5/conf directory and create zoo.cfg. Both hosts need the same file.

vi zoo.cfg

# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
# Change to your repository location.
dataDir=/u01/hadoop/zookeeperCluster
# the port at which the clients will connect
# Changed from the default port 2181; that port is already used by another program
clientPort=12181
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
# The default quorum ports 2888 and 3888 are used by another program, so 12888 and 13888 are used here
server.1=host1:12888:13888
server.2=host2:12888:13888

3. Depending on your log4j.properties settings, note where the log files will be written.

4. Create a myid file in the dataDir (dataDir=/u01/hadoop/zookeeperCluster).
On host1 the file contains "1".
On host2 the file contains "2".
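
For example, using the dataDir configured above:

# on host1
echo "1" > /u01/hadoop/zookeeperCluster/myid

# on host2
echo "2" > /u01/hadoop/zookeeperCluster/myid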

5. Start the server on each host
host1 :
cd ${zooKeeperHome}
 ./bin/zkServer.sh start

host2:
cd ${zooKeeperHome}
 ./bin/zkServer.sh start
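
To confirm the ensemble formed, each node can report its role. With two nodes, one should show Mode: leader and the other Mode: follower once both are up:

./bin/zkServer.sh status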

6. Test the client connection
bin/zkCli.sh -server host1:12181
ls /

[zk: host1:12181(CONNECTED) 0] ls /
[zookeeper]
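
A slightly stronger test is to write and read a znode through the client. create, get and delete are standard zkCli commands; the path and data here are arbitrary examples:

create /test hello
get /test
delete /test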






flume java.lang.ClassNotFoundException: org.apache.hadoop.io.SequenceFile$CompressionType

To resolve the following error:
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.io.SequenceFile$CompressionType
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:356)

You need to copy hadoop-core-x.x.x.jar into the ${flume_home}/lib directory.

The jar can be found under your Hadoop home directory.

e.g. with Hadoop 1.0.4 installed in /home/hadoop/hadoop-1.0.4, hadoop-core-1.0.4.jar sits in that directory, so:

cp /home/hadoop/hadoop-1.0.4/hadoop-core-1.0.4.jar ${flume_home}/lib

should resolve the problem.
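
To double-check that the missing class is actually inside the jar you copied, list the jar contents (a quick sanity check, assuming the JDK jar tool is on the PATH):

jar tf ${flume_home}/lib/hadoop-core-1.0.4.jar | grep 'SequenceFile\$CompressionType'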


Tuesday, April 9, 2013

Set up MongoDB sharding: a sample configuration and a test of sharding behavior


Terminal 1
mongod --port 27022 --dbpath /config --configsvr

Terminal 2
mongos --configdb localhost:27022 --port 27034 --chunkSize 1

Terminal 3
mongod --port 27023 --dbpath /app/mongo/shards/shard0/data --shardsvr

Terminal 4
mongod --port 27024 --dbpath /app/mongo/shards/shard1/data --shardsvr

Connect to mongos
mongo localhost:27034


mongos>  db.runCommand({addshard : "localhost:27023",allowLocal: true})
{ "shardAdded" : "shard0000", "ok" : 1 }
mongos> db.runCommand({addshard : "localhost:27024",allowLocal: true})
{ "shardAdded" : "shard0001", "ok" : 1 }

mongos> testdb = db.getSisterDB("testdb");
testdb
mongos> db.runCommand({ enablesharding:"testd"})
{ "ok" : 1 }
mongos> db.runCommand({ enablesharding:"testdb"})
{ "ok" : 1 }

mongos> db.runCommand({ enablesharding:"testdb"})
{ "ok" : 0, "errmsg" : "already enabled" }



mongos> db.runCommand({shardcollection : "testdb.testcollection", key : {testkey : 1}})
{ "collectionsharded" : "testdb.testcollection", "ok" : 1 }



Use the following Java program to insert 20,000+ rows into the new database. (MongoHelper, not shown here, is a small helper class that wraps the MongoDB Java driver connection and insert calls.)

package dbHelper;

import java.util.Random;
import com.mongodb.*;

public class InsertTester {

    MongoHelper mh; // small wrapper around the MongoDB Java driver connection (not shown)

    public InsertTester(String p_hostname, int p_portnumber, String p_username, String p_password, String p_dbname) {
        try {
            mh = new MongoHelper(p_hostname, p_portnumber, p_username, p_password, p_dbname);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    public static void main(String[] args) {
        // Connect through mongos (port 27034) and insert documents with a random shard key
        InsertTester tester = new InsertTester("cvlqmongo1", 27034, null, null, "testdb");
        Random rd = new Random();
        for (int i = 0; i < 10000; i++) {
            int testKey = rd.nextInt();
            BasicDBObject myobj = new BasicDBObject("testkey", testKey);
            myobj.append("Content", "WhatEver");
            tester.mh.addContents("testcollection", myobj);
        }
    }
}

Row counts 
mongos> db.testcollection.count();
20342

The counts on shard0 and shard1 are roughly even, with almost 10,000 documents on each node.
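
The chunk distribution can also be checked from mongos itself, without connecting to each shard, using the shell's built-in sharding status helper:

mongos> db.printShardingStatus()

The output lists the registered shards and, for each sharded collection, which shard owns each chunk range.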


Add another shard

Terminal 5
mongod --port 27025 --dbpath /app/mongo/shards/shards/shard2 --shardsvr 

mongo localhost:27034
db.runCommand( { addshard : "localhost:27025", allowLocal:true});

mongos> use admin
switched to db admin
mongos> db.runCommand({listShards:1});
{
        "shards" : [
                {
                        "_id" : "shard0000",
                        "host" : "localhost:27023"
                },
                {
                        "_id" : "shard0001",
                        "host" : "localhost:27024"
                },
                {
                        "_id" : "shard0002",
                        "host" : "localhost:27025"
                }
        ],
        "ok" : 1
}


Executed the testing program again to insert 20,000+ more rows. Node 3 only has 2 rows populated so far, presumably because the balancer migrates existing chunks gradually and has not yet moved much data onto the new shard.

mongos> use testdb
switched to db testdb
mongos> db.testcollection.count();
40342

Node 1: 
mongo localhost:27023
> use testdb
switched to db testdb
> db.testcollection.count();
19957


Node 2: 
 mongo localhost:27024
> use testdb
switched to db testdb
> db.testcollection.count();
20383

Node 3: 
 mongo localhost:27025
> use testdb
switched to db testdb
>  db.testcollection.count();
2






Monday, April 8, 2013

DYNAMIC_REGISTRATION_LISTENER = OFF causes the listener to stop working correctly

     Several of my Oracle databases suddenly had invalid objects because their database links no longer worked. The error message was "ORA-12514: TNS:listener does not currently know of service requested in connect descriptor".

     It turned out that someone had set DYNAMIC_REGISTRATION_LISTENER = OFF on the source database without adding static descriptions of what the listener should handle. As a result, database services and instances were no longer dynamically registered with the default-port listener, and the listener ended up running without serving anything.

     To fix it, the following SID_LIST description needs to be added to listener.ora to tell the listener which services it handles.

SID_LIST_LISTENER =
  (SID_LIST =
    (SID_DESC =
      (GLOBAL_DBNAME= service.company.com)
      (ORACLE_HOME=$ORACLE_HOME)
      (SID_NAME =somesid )
    )
  )
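
After updating listener.ora, reload the listener and verify that the service is now listed (standard lsnrctl commands):

lsnrctl reload
lsnrctl services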

Sunday, April 7, 2013

resolv.conf gets overwritten in Oracle Enterprise Linux Server 6 (EL6)


/etc/resolv.conf gets overwritten in EL6. There are two ways to address it.


1. Use the NetworkManager GUI to change the interface settings:
System -> Preferences -> Network Connections -> Wired tab -> [InterfaceName] -> Edit -> IPv4 Settings tab
Change DNS servers
Change search domains

2. Make the following manual change.
In the file: /etc/sysconfig/network-scripts/ifcfg-<iface>

You need to add all of your specific resolv.conf entries, such as:

DNS1="216.239.32.10"
DNS2="216.83.130.2"
DNS3="216.83.130.7"
DOMAIN="mydomain.com"
SEARCH="mydomain.com. yourdomain.com. otherdomain.com."