September 8, 2010

Telehouse West

Telehouse West (February 2009)

When I first started my career in the Internet Service Provider (ISP) industry, I worked for a fledgling business ISP that had 2 racks of core equipment located in the Telehouse data centre in London Docklands. At the time Telehouse consisted of a single building, now known as Telehouse North, and although it was a prestigious location it was also slightly chaotic inside. These were the days well before DSL, before all companies had a presence on the web, and back when you needed a modem to connect to the Internet along with a dial-up account that you had to pay for. The ISP Freeserve then came along and provided free dial-up access accounts, and it was not long before British Telecom started to roll out ADSL to the masses.

During these heady days of the Internet and the dot-com era we maintained 2 racks in Telehouse with various IT equipment and I would often be called upon to travel across London and install new equipment or connect new customer leased lines following a BT installation. Telehouse was much more informal during this time and, although security was still tight, you could pretty much run your own cables within your racks, between racks in the same suite, and between suites in the building. From a tenant perspective this was wonderful since you had the flexibility to install and connect when and where you wanted, but in the long run this was a very bad thing indeed. In fact the ghosts of poor cabling practice are still preserved around Telehouse North to this day. Huge over-tight cable looms still run through suites, carrying large Telehouse signs warning of a near death experience should you try to sneak any more cables in, while some floor tiles have a disturbing wobble as they balance on packed cable runs under the floor. I would hate to think what would happen should your connection develop a fault and rely on something in that mess, because you would have no chance of tracing your specific cable.

Telehouse West (April 2009)

Thankfully Telehouse brought everyone into line with strict cable policies and then made headway on power usage, since until then there was either no restriction on how much power you took per rack or Telehouse had no method of policing the power draw. Many of our customers were drawn to hosting in Telehouse because they knew that their power usage was not an issue, compared to the new breed of data centres that were very strict on power use and charged heavily, but there was a flip side to this: from time to time there will be outages in a data centre and those abusing the power policies would suffer the most. When your rack of 42 servers experiences a power related outage it can be hard work getting all those servers back online, since you cannot just power them all on at once. Inevitably these were the very same customers who would bleat the loudest when their faulty server blew the power to the rack, which then could not be powered back on for a length of time because they had so much overuse in the rack. I used to shrug my shoulders and say the same thing: we warned you that your rack usage and density was dangerous from a service perspective but you did nothing about it because you were happy to cram as much into the rack as humanly possible just to increase your profit margin.


Telehouse West (August 2009)

Telehouse completed the East building in 1999, which embraced controls over working practices and power management, and the demand for data centre space continued at breakneck pace. The Docklands area is now littered with data centres, such as Global Switch, SunGard, Telecity, and Redbus. Providing power to these facilities and providing premier hosting environments has become very costly and there is now a demand for slightly cheaper hosting space outside London (coupled with a need for disaster recovery space far from customers' primary hosting locations). We are now moving into the virtualised data centre arena, as pushed by BT, but I have digressed from talking about the new Telehouse building.

In March 2009 Telehouse announced that a new data centre facility was to be built on the existing site, to be named Telehouse West, and since I walk past the site each day on the way to work, I decided to take some photographs as the building progressed. The first months went by with very little visible progress, although work was very much underway on the foundations and across the whole site. I still pop into Telehouse occasionally so I would also check out the view from the bridge that connects the reception building to the North building. The building work put more constraints on the available parking at the facility, but once the framework started to go up the building progressed at a very fast pace.

Telehouse West (April 2010)

As a ‘key decision maker’ within our hosting business I was then invited over to Telehouse West for a tour as the first data suite was being primed for live service. The site was still very much a construction site so boots and a hard hat were supplied and we started our tour by viewing the generator room, which is the smaller squat building on the above picture. The room was much like a large cavern at the time with generators primed along one side and a large empty space where more equipment was due to be installed at a later date. The lifts were not yet in full service, nor was the walkway from the reception building, so we joined the builders in using the stairways to make our way up to the first data suite floor. Unfortunately it’s very difficult to glamorise a data centre and if you have seen one room full of racking equipment then you have pretty much seen them all; everything tends to be grey, there are various power distribution units scattered around, and you have the gentle hum of the cooling in the background (plus the not so gentle sound of running IT equipment if the racks are populated).

The new building conformed to everything I would expect from a data centre although the unique selling point of this particular new building, which has been latched onto by the IT media such as Data Center Knowledge and Slashdot, is that excess heat generated by all the IT equipment will be used in a district heat network for the local Docklands community. That is not so much of a draw for people wanting to host their servers but it’s an interesting idea and good use of excess heat in a time when data centres are under even more scrutiny to do their part for the environment given the amount of power they use.

My only minor observations from the tour of Telehouse West:

1. Telehouse will allow tenants to select their own racks and install them based on a footprint and power cost. My experience is that you have to be very patient to operate this type of service model and you have to police installation closely. My personal preference is to provide a standard rack only, sized to accommodate most IT equipment, so that everyone has the same rack. From an aesthetic view this is much nicer since the suite becomes uniform and there are no odd shaped racks or different coloured racks splattered around the room. Again, this is just my personal preference, and I am sure potential customers would much prefer the luxury of installing their own purple racks as recommended by their equipment vendor.

2. Car parking at Telehouse has become very tricky in recent years and the new building has seen a temporary reduction in space, which has been offset by a temporary car park being set up on adjacent land, but once everything is complete the overall car parking will be reduced from previous levels. I hope that something more permanent can be set up on the currently vacant land adjacent to the facility, otherwise I am sure there will be queues of traffic outside the entrance gates based on a 1-out-1-in policy.

Resizing the Broadworks Datastore (DSN)

At work I hold responsibility for an ageing Broadworks VoIP telephony platform that provides service for 2 of our offices, for our home and on-call engineers, and for a small grouping of customers. The platform runs on a Sun Solaris architecture and the software is now getting very old, but we have no plans to update it since the intention is to replace the entire platform, bringing us into the corporate telephony system and moving the customers over to a new Broadworks platform.

This morning a number of our internal users reported a problem trying to make updates via the GUI with the following pertinent error message buried within a long list of errors:

Data store space exhausted

Initially I checked all the Broadworks servers to look for space issues but none were found, which matched with the fact that none of our monitoring servers reported a disk space issue. The next step was to SSH into the application servers (as1 & as2) where the following error was immediately reported by Broadworks:

TimesTen temporary memory area is at 95% of total temporary size. (Currently in use size is at 95%. Allocated size is 16384, high water mark is 16062 and in use size is 15667.)
Increase your datastore temporary size area (using the resizeDSN tool)
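
As a quick sanity check on the figures in that warning, here is a minimal Python sketch that pulls the three numbers out of the message and works out the percentages. The function name and the regular expression are my own, written against the exact wording shown above, so treat it as an illustration only and not as anything supplied with Broadworks or TimesTen.

```python
import re

# The warning exactly as it appeared on our application servers.
MESSAGE = (
    "TimesTen temporary memory area is at 95% of total temporary size. "
    "(Currently in use size is at 95%. Allocated size is 16384, "
    "high water mark is 16062 and in use size is 15667.)"
)


def temp_usage(message):
    """Extract the allocated, high water and in-use figures from the warning."""
    pattern = (
        r"Allocated size is (\d+), high water mark is (\d+) "
        r"and in use size is (\d+)"
    )
    match = re.search(pattern, message)
    if match is None:
        raise ValueError("message does not look like the TimesTen warning")
    allocated, high_water, in_use = (int(group) for group in match.groups())
    return {
        "allocated": allocated,
        # Integer percentages, rounded down as in the warning itself.
        "in_use_pct": 100 * in_use // allocated,
        "high_water_pct": 100 * high_water // allocated,
    }


print(temp_usage(MESSAGE))
```

The in-use figure works out at 95% as reported, and the high water mark shows the temporary area had peaked even closer to the allocated size.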

This gave some more useful information so it was time to use some Google-fu and find some answers. Thankfully someone had experienced the very same problem very recently and had included a guide on his blog. So hats off to Mark Holloway for posting his entry Resizing the Broadworks Datastore (DSN). The rest of my guide is based on the article published by Mark along with some of the issues we experienced on the way.

The first step is to check the amount of available memory on the as1 & as2 servers. Our servers run Solaris so the following command was suitable for us:

bash-2.05$ prtconf | grep Mem
Memory size: 2048 Megabytes

The guidelines seem to suggest that the perm size should equal approx 25% of the physical memory and the temp size should equal approx 25% of the perm size allocation. Nothing up to this point had revealed what our current allocations were, but we proceeded anyway and found at step 7 that the system shows you the current settings before asking for the new values. In our case, after reviewing the current memory allocation, we decided to leave the perm size alone and just slightly increase the temp size.
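
To make the rule of thumb concrete, here is a small Python sketch of the arithmetic. The function name is made up for illustration, and the 25% figures are only the approximate guideline described above, not a vendor formula.

```python
def suggest_dsn_sizes(physical_mb):
    """Return (perm_mb, temp_mb) from the rough 25% / 25% guideline."""
    perm_mb = physical_mb // 4  # perm is roughly 25% of physical memory
    temp_mb = perm_mb // 4      # temp is roughly 25% of the perm size
    return perm_mb, temp_mb


# Our servers reported 2048 MB of physical memory.
print(suggest_dsn_sizes(2048))  # (512, 128)
```

Our own allocation was well below these figures (128 MB perm and 16 MB temp), which is why a modest bump of the temp size was enough for us.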

This is the type of output you will see at step 7 to give you an idea of how to check the current allocation and what the change request will look like:

--------------------
Current Date:
Current Perm Size:  128
Current Temp Size:  16
Current Total Size: 144
--------------------

Select the new database Perm size….
Available Perm sizes in MB (64 128 256 512 1024 1536 2048) [144

Current Date:
Current Perm Size:  128
Current Temp Size:  16
Current Total Size: 144
Target Perm Size:  128
Target Temp Size:  32
Target Total Size: 176
--------------------

Do you wish to proceed (y/n) [y]?

Below are the steps required:

1. SSH to as1 as bwadmin
2. stopbw
3. repctl stop
4. su as root
5. cd /usr/local/broadworks/bw_base/bin
6. ./timesten.pl unload
7. ./resizeDSN
8. exit (return to bwadmin)
9. repctl start
10. startbw

We found that Broadworks would not start properly straight away afterwards and reported 2 issues. The first related to the ‘Execution Server’:

--------------------------------
System Health Report Page
BroadWorks Server Name: as1
Date and time : Wed Oct 28 10:22:25 GMT 2009
Report severity : CRITICAL
Server type : AppServer
Server state : Unlock
--------------------------------

BroadWorks AppServer processes in trouble:

Execution Server not running

--------------------------------

Recommendations
---------------

The Application Server needs to be restarted

--------------------------------

However, while we were looking around the system Broadworks generated a broadcast message to state that a start had been initiated:

bwadmin@as1$ Broadcast Message from bworks (console) on as1 Wed Oct 28 10:45:33…
===== BROADWORKS CONTROL — START INITIATED =====

The error message then changed to:

--------------------------------
System Health Report Page
BroadWorks Server Name: as1
Date and time : Wed Oct 28 11:00:33 GMT 2009
Report severity : CRITICAL
Server type : AppServer
Server state : Unlock
--------------------------------
Replication is not running for DSN AppServer. Databases may be out-of-synch.

File replication is not running.

--------------------------------

Recommendations
---------------

Replication must be started (repctl start). If databases are out-of-synch they must be re-synchronized first (with the
importdb.pl tool). Please refer to the BroadWorks Maintenance Guide for detailed procedures.

Perform a file replication restart (repctl restart)

--------------------------------

We tried to restart replication as stated, which appeared to work, but then the same error would appear again. At this point we started to raise a support ticket with Broadsoft but by magic the error vanished and the system began to respond without any errors. It seems we had rushed through the changes so quickly that we had not allowed all the systems to start correctly and it was just a case of learning some patience. If in doubt just slow down and use the healthmon command to check on the status:

healthmon -l
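
If you would rather script the "slow down and check" step, a rough Python sketch along these lines could read the severity out of a saved health report before you move on. It assumes the report text looks like the System Health Report pages shown earlier in this post; the function names and the sample text are mine, and I have not checked what healthmon prints on a healthy system.

```python
SAMPLE_REPORT = """\
System Health Report Page
BroadWorks Server Name: as1
Date and time : Wed Oct 28 11:00:33 GMT 2009
Report severity : CRITICAL
Server type : AppServer
Server state : Unlock
"""


def report_severity(report_text):
    """Return the value of the 'Report severity' line, or None if absent."""
    for line in report_text.splitlines():
        # Split on the first colon only, so timestamps are left alone.
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "report severity":
            return value.strip()
    return None


def is_critical(report_text):
    """True when the report explicitly says CRITICAL."""
    return report_severity(report_text) == "CRITICAL"


print(report_severity(SAMPLE_REPORT))
```

The idea is simply to hold off on the next server while is_critical() still returns True for the latest report.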

The blog article then advises to wait 10 minutes before moving onto as2 so we popped down to the vending machine to shoot the breeze and catch up on the gossip.

Here are the steps we then used on as2:

1. SSH to as2 as bwadmin
2. stopbw
3. repctl stop
4. su as root
5. cd /usr/local/broadworks/bw_base/bin
6. ./timesten.pl unload
7. ./resizeDSN
8. exit (return to bwadmin)
9. importdb.pl AppServer as1 AppServer (replace as1 with your primary AS hostname or IP)
10. repctl start
11. startbw

We found that step 9 came with a big bag of fail attached, so we had to back up the database on as1, copy it across to as2, and then manually import it onto as2:

1. On as1:  bwBackup.pl AppServer dbBackup.db
2. scp the file to as2:  scp dbBackup.db bwadmin@as2:dbBackup.db
3. On as2: stopbw
4. repctl stop
5. bwRestore.pl AppServer dbBackup.db
6. repctl start
7. startbw

That dealt with our problem and our 2 servers were once again fully operational with the Helpdesk busy dealing with requests to make changes on the system.

I realise that I am just standing on the shoulders of giants and without the original posting I would probably still be busy dealing with the Broadsoft support team at the moment (who are generally excellent btw).

proftpd – fatal: Socket operation on non-socket

I run my personal email and web on an old Sun Cobalt 550 server installed with Strongbolt (CentOS) Linux. It’s a good way to reuse end of life hardware: the box has a small physical form factor, runs a current Linux OS, and is easy to manage. My server automatically updates itself via yum but every now and then an error crops up with proftpd following an automatic update. My monitoring of the FTP port will go crazy and I will not be able to connect via FTP.

It’s no use trying to restart proftpd via the init script because it is an xinetd service. The following will occur whether xinetd is working properly or not:

[root@jenna ~]# /etc/rc.d/init.d/proftpd restart
Shutting down proftpd: [FAILED]
Starting proftpd: jenna - fatal: Socket operation on non-socket
[FAILED]

However, most people will try the above first because they’re used to using init scripts, and a quick Google will reveal lots of people whose proftpd was set to use xinetd being told to swap the config over to standalone mode. That is not the correct course of action on a Cobalt box installed with Strongbolt because the system is meant to be using xinetd.

Strongbolt / CentOS uses xinetd for proftpd and the automatic updates disable the xinetd settings:

[root@jenna ~]# cd /etc/xinetd.d/

[root@jenna xinetd.d]# vi xproftpd

Change this line:

disable                 = yes

To:

disable                 = no

Now restart xinetd:

[root@jenna xinetd.d]# /etc/rc.d/init.d/xinetd restart

That has solved the problem for me most of the time, although I once found a completely blank xproftpd file and had to restore it from xproftpd.rpmsave, which was in the same directory.
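
Since the update keeps flipping that one setting, the edit itself is easy to script. Below is a minimal Python sketch of the text change only; the function name is my own and the sample config is an illustrative stand-in, not the real contents of my xproftpd file.

```python
import re


def enable_xinetd_service(config_text):
    """Return config_text with any 'disable = yes' line set to 'disable = no'."""
    # Match only spaces and tabs around the value so line breaks are untouched.
    pattern = r"(?m)^([ \t]*disable[ \t]*=[ \t]*)yes[ \t]*$"
    return re.sub(pattern, r"\1no", config_text)


# Illustrative xinetd stanza (hypothetical contents).
SAMPLE = (
    "service ftp\n"
    "{\n"
    "\tdisable                 = yes\n"
    "\tsocket_type             = stream\n"
    "}\n"
)

print(enable_xinetd_service(SAMPLE))
```

To use it for real you would read /etc/xinetd.d/xproftpd, write the result back as root, and then restart xinetd as shown above. It does nothing for the blank-file case, where restoring from xproftpd.rpmsave is still the fix.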

Simpla WordPress Theme Widgets


I used the Simpla WordPress theme for this site but found that it did not allow for the use of widgets on the sidebar. I crawled the InterTubes and found that the Typpz Blog experienced the same issues but contained a post showing how to make the required changes manually while also providing an updated theme with the changes included. I made my changes manually so I could better understand the mechanics of the site but if you’re considering the Simpla theme then you may want to just download the updated theme.

Here are the manual changes I made for reference:

cat /dev/null > sidebar.php

I wiped the contents of the sidebar.php file in the Simpla theme directory from the command line. This can be done through the theme editor in WordPress but I prefer to work within Linux.

Then I pasted the below code into the now empty sidebar.php.

<div id="sidebar">
<ul>
<?php if ( !function_exists('dynamic_sidebar') || !dynamic_sidebar() ) : ?>

<li id="pages">
<h2>Pages</h2>
<ul>
<li><a href="<?php echo get_settings('home'); ?>/">Home</a></li>

<?php wp_list_pages('title_li='); ?>
</ul>
</li>
<li id="categories">
<h2>Categories</h2>

<ul>
<?php wp_list_cats('sort_column=name&optioncount=1&hierarchical=0'); ?>
</ul>
</li>
<li id="links">

<ul>
<?php get_links_list(); ?>
</ul>
</li>
<?php endif; ?>
</ul>
</div>

Once the sidebar.php file was saved I then created a new file in the same directory called functions.php and entered the below code.

<?php
if ( function_exists('register_sidebar') )
register_sidebar();
?>

That makes the theme widget friendly so you can now go crazy through the admin system. However, a minor change is required on the sidebar to deal with some creeping dots on the right sidebar. To correct this an addition is required in the style.css file.

Find this section in style.css:

#sidebar ul li{
border-bottom:1px dotted #ddd;
margin-bottom:0.3em;
padding:0.3em;
}

Change it to:

#sidebar ul li{
margin-bottom:0.3em;
padding:0.3em;
}
#sidebar ul ul li{
border-bottom:1px dotted #ddd;
}

Hats off to Typpz for providing the code and updated theme for everyone to enjoy!

Reset a Mailman List Password


I’ve taken over ownership of a venerable Red Hat server that runs Mailman for a number of internal and customer mailing lists. Unfortunately much of our internal documentation for this box is poor and I needed to reset the admin password on a specific list to make some amendments. With no recorded password I had to hunt around for a way to manually change the password from the command line:

/usr/lib/mailman/bin/change_pw -l mail-list-name -p new-password
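
If you have several lists to reset, a small Python sketch like the one below could generate a random password and build the command for each list. The helper name is hypothetical and the path is simply the one from the command above; it only builds the argument list, leaving you to run it (and record the password this time).

```python
import secrets
import string

# Path to change_pw as used above; adjust if Mailman lives elsewhere.
CHANGE_PW = "/usr/lib/mailman/bin/change_pw"


def build_change_pw_command(list_name, length=16):
    """Return (password, argv) for resetting one list's admin password."""
    # Letters and digits only, so the password can never look like an option.
    alphabet = string.ascii_letters + string.digits
    password = "".join(secrets.choice(alphabet) for _ in range(length))
    return password, [CHANGE_PW, "-l", list_name, "-p", password]


password, command = build_change_pw_command("mail-list-name")
print(" ".join(command))
```

The argument list can be handed to subprocess.run(command, check=True) on the Mailman server once you are happy with it.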