Linux Foundation Introduction to Linux Training On edX

I took the Intro to Linux course on edX. Here are my thoughts.

The Linux Foundation recently created an Intro to Linux course that they made available for free on the edX training platform. I decided to take the course and recently completed it. The course is still available and is free. (It is $250 if you want the “Verified Track” to prove you took, and passed, the course.) The course is self-paced and online.

My Linux background includes maintaining my own web server on Linux. I learned what I needed to (but not much more) and managed the server from the command line. I shut down my server a few months ago and figured this would be a refresher while filling in some gaps. Plus, this training is targeted to desktop users. My last desktop Linux usage was Mandrake Linux (now Mandriva) about a decade ago and I wasn’t too impressed.

As I mentioned, the course is self-paced and it’s estimated to take 40 to 60 hours. I didn’t time myself and since I did have experience my time isn’t a good indication anyway. The 40 to 60 hours seems reasonable for someone who is new to Linux although much of that time would be working the labs and exploring on your own.

The course uses three distributions for its examples: Ubuntu, CentOS and openSUSE.

The course is called an introduction and lives up to the name. There’s broad coverage that doesn’t go very deep. The course is divided into 18 chapters:

  1. The Linux Foundation
  2. Linux Philosophy and Concepts
  3. Linux Structure and Installation
  4. Graphical Interface
  5. System Configuration From The Graphical Interface
  6. Command Line Operations
  7. Finding Linux Documentation
  8. File Operations
  9. User Environment
  10. Text Editors
  11. Local Security Principles
  12. Network Operations
  13. Manipulating Text
  14. Printing
  15. Bash Shell Scripting
  16. Advanced Bash Scripting
  17. Processes
  18. Common Applications

Each chapter is sub-divided into sections that each focus on one area of the chapter.

There is some good content within each section. But to really learn you’ll need to work on your own to see what does and doesn’t work. That said, if the goal is to pass the test at the end it’s not too difficult. All the needed material is in the course and there’s no time limit on finishing the final exam. Plus it’s not only open book, it’s open everything.

That ease of passing makes the course unsuitable, at least in my opinion, as proof of knowledge when looking for a job. I’d recommend against spending the $250 for the verified certificate, which can be included on a resume. There may be exceptions: if you’re new to Linux and want a Linux-related job it may help, but you’ll need to consider the course material a starting point and dig in some more.

The material is a mixture of short videos (I only recall one that was over two minutes), text, and some do-it-yourself examples. The do-it-yourself examples really only reinforce the material by having you do it. Some are just typing tests where you’re given the information you need; if you get it wrong you’re given the answer and then have to enter it again. There are a few questions at the end of each section.

Each Chapter has a lab for you to do on your own. The answers are also provided. This is the only part of the training where you need your own Linux installation. The do-it-yourself sections of the training are done online, not in your own Linux installation.

Bottom Line

It’s an introductory course so it can’t be faulted for being basic. I did learn a little, but if you have some self-taught Linux experience like I do, you aren’t going to learn a lot. I’m guessing I spent less than one hour per chapter, so there wasn’t a big time investment and it was free. I’d recommend anyone think twice before spending the $250 for the verified track, but the course is worth the time investment if you’re new to Linux and want someplace to start.

Adding A Trusted SSL Certificate

I’ve been using a self-signed certificate for years in order to encrypt my admin traffic to my websites. Some of my integrations have had problems with self-signed certificates, wanting a trusted certificate. Most had a work-around but I finally decided to bite the bullet, spend some money, and get a trusted certificate from a respected certificate authority. Here’s the setup, completed on a Sunday afternoon.

Up until now I’ve only used SSL on my web server to encrypt my WordPress admin traffic. A self-signed certificate was fine for that since I wasn’t worried about confirming the identity of my server. But recently I’ve had a couple situations where I wanted integration that required a trusted certificate (one issued by a recognized certificate authority). I’ve been able to implement workarounds but they seemed to compromise security, if only by a little. So I decided to go out and get an “official” SSL certificate.

I’ve been considering it for awhile, and I’d basically decided long ago on DigiCert to provide the certificate. They came to my attention, along with a lot of information about certificates in general, while listening to Steve Gibson’s Security Now! podcast. In 2011 he was shopping for a new certificate provider and talked about it over several episodes spanning a couple of months. They aren’t a bargain basement shop, but they aren’t outrageous either.

What I Have Now – What I Want

I have a VPS server with Linode with one IP address and a self-signed certificate used by all sites on that IP address. I want to get to a configuration where one of those sites uses a trusted certificate and the others continue to use the self-signed certificate.

Implementation – Pulling the Parts Together

1. The first step is to add an IP address to my server for the new certificate. There are certificate types that can secure multiple websites on one IP address, but I’m only buying the certificate for one site, so I need one IP address for the self-signed certificate and one for the trusted certificate. Linode requires a support ticket to justify the new IP address. There were some forum threads that said this was a difficult process; maybe it’s because I only needed one extra IP, but it was fast and simple. I just said I was adding an SSL certificate. I had the new IP address about an hour later (on a Sunday).

2. I followed these instructions to add my second IP address once it was assigned. The only difference from the instructions was I did list the gateway for the second IP address and I needed to reboot the server to recognize the new IP address. Cycling network services wasn’t enough and in fact displayed a message that it might not work.

3. Next I needed to get the SSL certificate. I removed domain privacy on the domain for which I was getting the certificate in order to streamline the proof of ownership. (I restored privacy once the certificate was issued.) I followed DigiCert’s instructions for Apache. Those instructions include a tool to create the Certificate Signing Request (CSR). When I used the tool it recognized I wasn’t their customer and offered a 30% discount along with an extra 60 days, so if you’re planning multiple purchases be sure to buy the big one first; the offer hasn’t come up again since I bought my certificate. The tool generated a command line I had to run on my server to create the CSR. After running the command I copied the generated .csr file to my local computer, opened it in a plain text editor, and pasted the contents into the DigiCert certificate order form. Since it was Sunday, and DigiCert says they don’t offer support on Sunday (except for emergencies), I didn’t expect a response. Instead I got one a short time later asking for supporting documentation to prove who I am, and I sent that along. I had the certificate about 2 hours after submitting the order.
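DigiCert’s tool generates the exact command for you, but as an illustration, a CSR can be created with OpenSSL along these lines (the file names and subject fields here are hypothetical placeholders, not what DigiCert’s tool produced for me):

```shell
# Hypothetical sketch: generate a new 2048-bit RSA private key and a CSR.
# The subject fields are placeholders -- substitute your real details.
openssl req -new -newkey rsa:2048 -nodes \
  -keyout www_mydomainname_com.key \
  -out www_mydomainname_com.csr \
  -subj "/C=US/ST=MyState/L=MyCity/O=MyOrg/CN=www.mydomainname.com"
```

The -nodes flag leaves the private key unencrypted so Apache can read it at startup without prompting for a passphrase.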

Implementation – Adding the SSL Certificate

1. Generating the CSR (step 3 above) also generated my private key, which needs to be protected. I’ll keep private keys in /etc/ssl/private, which is the default location on my Debian 6 server, so I copy the file there and make sure it’s only accessible by root.

sudo mv /path/to/file/www_mydomainname_com.key /etc/ssl/private
sudo chown root:root /etc/ssl/private/www_mydomainname_com.key
sudo chmod 600 /etc/ssl/private/www_mydomainname_com.key
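A quick way to confirm the ownership and permissions took (the path is the same placeholder file name used above):

```shell
# Should report: 600 root root
stat -c '%a %U %G' /etc/ssl/private/www_mydomainname_com.key
```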

2. I’ll keep the actual certificates in /etc/ssl/certs which is also the default location on my Debian 6 server. Once I received the certificate from DigiCert I copied it to my home directory on the server then moved it to the /etc/ssl/certs directory and made it accessible only to the root user.

sudo mv /path/to/file/www_mydomainname_com.crt /etc/ssl/certs
sudo chown root:root /etc/ssl/certs/www_mydomainname_com.crt
sudo chmod 600 /etc/ssl/certs/www_mydomainname_com.crt
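Before wiring the files into Apache it’s worth verifying that the certificate and private key actually belong together; if the two digests below differ, Apache will refuse to start the SSL site:

```shell
# The RSA modulus of the key and the certificate must be identical,
# so the two MD5 digests printed here should match.
openssl rsa  -noout -modulus -in /etc/ssl/private/www_mydomainname_com.key | openssl md5
openssl x509 -noout -modulus -in /etc/ssl/certs/www_mydomainname_com.crt | openssl md5
```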

So now the certificate is in place. It’s time to set up networking. I now have two IP addresses on the server: the original IP that everything is configured to use, and the new one. I use name-based hosting, with all my sites running on the original IP. I’ll want the newly trusted SSL site on the new IP, and for convenience I’ll move the regular website to it as well. But since DNS takes time to propagate, I’ll want the regular site to listen on both IP addresses for a while. Servers differ, so yours may vary, but this is how I did it.

1. Change the /etc/apache2/ports.conf file so Apache listens on each IP individually as well as on both simultaneously.

NameVirtualHost *:80
Listen 80
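As an illustration with documentation-range stand-ins (192.0.2.10 for the original IP, 192.0.2.20 for the new one — not my real addresses), the per-IP listeners would look something like this:

```apache
# Hypothetical sketch only -- 192.0.2.10 and 192.0.2.20 stand in for
# the original and new IP addresses. Each IP gets its own name-based
# vhost set, plus a wildcard so a vhost can bind both at once.
Listen 80
NameVirtualHost 192.0.2.10:80
NameVirtualHost 192.0.2.20:80
NameVirtualHost *:80
```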

2. Then for SSL I configure the SSL hosts as follows.

NameVirtualHost *:443

The NameVirtualHost *:443 statement is so that multiple sites can share an IP address with my self-signed certificate.

Then it’s time to edit the host file for the domain.

1. I change the <VirtualHost> statement to <VirtualHost *:80> so it runs on both IP addresses.

2. I change the SSL host statement from <VirtualHost *:443> to one naming only the new IP address on port 443, so it runs only on the new IP address. I won’t be able to administer the WordPress site until DNS propagates, but this isn’t a big deal.

3. I then add the necessary statements to the host file in order to enable the certificate:

SSLCertificateFile /etc/ssl/certs/www_mydomainname_com.crt
SSLCertificateKeyFile /etc/ssl/private/www_mydomainname_com.key
SSLCertificateChainFile /etc/ssl/certs/DigiCertCA.crt

The first two lines are the certificate and key files that I added earlier. That last line is the DigiCert intermediate certificate that needs to be added to the certificate chain. DigiCert provided the certificate when mine was sent. If I had multiple certificates from DigiCert they would all use this same certificate as the intermediate certificate.
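Putting the pieces together, the SSL virtual host ends up looking something like this sketch (192.0.2.20 is a documentation-range stand-in for my new address; the ServerName and DocumentRoot are placeholders following the file names used above):

```apache
# Hypothetical sketch -- substitute your real IP, hostname, and paths.
<VirtualHost 192.0.2.20:443>
    ServerName www.mydomainname.com
    DocumentRoot /path/to/site/public

    SSLEngine on
    SSLCertificateFile      /etc/ssl/certs/www_mydomainname_com.crt
    SSLCertificateKeyFile   /etc/ssl/private/www_mydomainname_com.key
    SSLCertificateChainFile /etc/ssl/certs/DigiCertCA.crt
</VirtualHost>
```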

5. Now it’s time to reload Apache. At this point it will run the regular site on both IP addresses. WordPress administration (which uses port 443 for SSL) will only listen on the new IP address, so I won’t have access to the admin console yet.

sudo /etc/init.d/apache2 reload

6. I update DNS to point the domain to the new IP address. I wait a little while for the DNS to propagate. DigiCert has a tool that will check the certificate installation. I ran it and it confirmed a valid installation. I could access the regular site and the SSL site without a problem.

7. I wait another day to allow the DNS change to propagate and then I change the regular site to only run on the new IP address by editing the Apache configuration one more time. I change the /etc/apache2/ports.conf file to listen on both IPs individually but not simultaneously.

Listen 80

I remove the ability to listen on both IP addresses for the regular websites. I don’t change the SSL port configuration.

8. Then I edit the site file for the domain so that it runs only on the new IP address: I change the <VirtualHost *:80> statement to one naming the new IP address on port 80.

9. I reloaded Apache one last time.

sudo /etc/init.d/apache2 reload

At this point my trusted certificate is running on my website. I no longer get the untrusted site or invalid certificate warnings.

One benefit over the free certificates offered by some providers is that all my browsers already recognize DigiCert as a valid certificate authority. There’s no need to load another certificate into my browser.

While not a cheap process, it was surprisingly fast and easy.

Alternative PHP Cache (APC) on Debian 6

After spending some time reviewing ways to increase WordPress performance beyond simple file caching, I decided to give APC (Alternative PHP Cache) an extended try. I use mod_fastcgid on my server, as opposed to mod_fastcgi. Web searches indicated that mod_fastcgi would be a better choice with APC since it could share the same cache between processes. With mod_fastcgid each process has its own cache, which also means more memory usage. I have a relatively small server, a 512 MB VPS. But in looking at free memory it appeared I’d be OK as long as I was careful and set some upper limits. Also, I had had some issues getting mod_fastcgi to run in the past so I didn’t want to try that again.


APC is an opcode cache. Opcode caches store the compiled PHP script in memory so it does not have to be interpreted each time it is called. WordPress is built with PHP so this should help performance. (“Compiled” is probably not technically accurate, but it gets the point across.) Other opcode caches are XCache and eAccelerator.

eAccelerator hasn’t been updated in a couple of years so I was hesitant to try it. APC is from the same people who develop PHP and it’s supposed to be rolled into PHP 6, so I decided to give it a try first. I may give XCache a try after running APC awhile, although if I’m honest I probably won’t bother if APC works OK. They can’t co-exist, so I can only install one at a time.

The one reason not to use APC is memory: I’ll be improving CPU utilization and performance at the expense of memory.

Throughout this article the terms cache and caching refer to the opcode caching provided by APC.


APC is in the Debian 6 repositories so it can be installed from there. I use aptitude, so I run:

aptitude install php-apc

Once it’s installed I reload Apache and APC is running. No admin interface is set up by default, so I extract the apc.php status page by running:

gzip -dc /usr/share/doc/php-apc/apc.php.gz > /path/to/my/mngmt/website/apc.php

The /path/to/my/mngmt/website is a website I have that’s a catch-all for various server-related pages. Access to it is restricted, and since apc.php does include some version information and file paths, you’ll want to restrict access too. I can open this page and get statistics about APC usage.
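If the management site weren’t already restricted, an Apache 2.2-style stanza like this would limit apc.php to a single admin address (the IP here is a placeholder):

```apache
# Hypothetical sketch for Apache 2.2 (Debian 6) -- 192.0.2.50 is a
# placeholder for your admin IP address.
<Files "apc.php">
    Order deny,allow
    Deny from all
    Allow from 192.0.2.50
</Files>
```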

When I opened apc.php and did a version check I saw that I was several versions behind. Aptitude installed 3.1.3p2 from 2009, while the latest stable version was 3.1.9 from May 2011, and there was a beta version from April 2012. APC was working and limited testing didn’t show any problems, but I decided to update.

Update Through PECL

PECL is the PHP Extension Community Library. First I install its installer, along with the packages needed to compile APC:

sudo aptitude install php-pear php5-dev apache2-dev

Once these are installed I can install (upgrade) APC:

sudo pecl install apc

Reloading Apache finishes things off. Because I had first installed APC through aptitude I didn’t have to do any manual edits.

Configuring APC

Configuring APC is where the real work is needed. I did several iterations testing different configurations. At this point I wasn’t measuring performance; I was looking to see what worked and what didn’t.

The apc.ini below shows the settings I was most interested in and what I’ve currently settled on. Memory is the biggest issue with APC. As I mentioned, each FCGID process creates its own cache. I’ve configured my server for a maximum of three processes so I can control memory usage. The easy way to determine how much memory to allocate is to divide my free memory (without caching enabled) by 3 (the maximum number of FCGID processes I’d have) and leave a little for OS buffers and breathing room. For me this worked out to about 100 MB (I had about 350 MB free). But that calculation is a limit, which doesn’t mean it’s a good size.

I’m going to jump ahead and show the configuration I settled on. My /etc/php5/conf.d/apc.ini contains the following entries (my actual filter expression matches just the two sites I cache):

extension = apc.so
apc.enabled = 1
apc.cache_by_default = 0
apc.filters = "+..."
apc.shm_segments = 1
apc.shm_size = 90
apc.stat = 1

The first two lines tell PHP to load the APC extension and enable caching.

I started off with a 256 MB cache and let it cache all my sites. This was much too large, so I’d have to monitor usage in case things started swapping to disk. Cache usage seemed to level off at around 90 MB after a few hours. I did go through and load the home page of all my sites (many are test sites or receive little traffic). This gave me a rough (and lowball) idea of what I’d need to cache everything, right at that 90 MB. In practice a 90 MB cache size resulted in a lot of cache fragmentation and cache reloads, which isn’t an optimal situation.

I took the approach of reducing my FCGID processes to 2 and increasing the cache to 128 MB. This provided better cache performance, but shortly after midnight I received a series of “[warn] [client] mod_fcgid: can’t apply process slot for /home/rayn/public_html/fcgi-bin.d/php5-default/php-fcgi-wrapper” errors. The client address in those errors was my own server, so some daily maintenance scripts needed more PHP processes than were available. Later in the morning one visitor experienced a timeout error. It may have been unrelated, but timeout errors were previously unheard of, so I switched back to three FCGID processes.

At this point it was obvious I had to cache less than everything. There are only two sites with enough traffic to benefit from caching, so I decided to limit caching to them. I handled this through the next two lines. apc.cache_by_default=0 turns off caching for all sites; it now needs to be turned on explicitly. It’s turned on through the apc.filters line. The + means the filter adds the matching files to caching. (By default the filter removes matches from caching.) The syntax is a POSIX regular expression and the list can be comma-delimited. I had trouble getting a comma-delimited list to work, although all indications are that it should, so I ended up combining them into one expression. I cache this site and one subdomain on a second site.
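As a hypothetical illustration (the paths are placeholders, not my real document roots), a single combined filter covering two sites might look like:

```ini
; Hypothetical paths -- substitute your own document roots.
apc.cache_by_default = 0
apc.filters = "+(/home/user/site1|/home/user/site2/sub)/.*"
```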

Caching doesn’t have to be enabled for a site in order to use it with a WordPress plugin that supports APC. For example, W3 Total Cache will use the cache even if it’s not enabled for the site.

I specified apc.shm_segments = 1 even though 1 is the default setting; I don’t want a later version to change the default and start eating more memory. I specify apc.shm_size=90 to create a 90 MB cache per FCGID process.

I also specified apc.stat=1 even though that’s the default. There’s debate about whether setting this to 0 will improve performance. A setting of 1 means APC will check each script on request to see if it’s been modified and re-cache it if it has. A setting of zero disables this check. The official documentation recommends setting it to 0 on a production server, but that can cause problems, as PHP changes (such as WordPress plugin updates) would go unnoticed. Once things are solid I may change it to 0 to see if there’s an improvement or any problems. My initial feeling is any improvement would be unnoticeable for my site and wouldn’t be worth the hassle of needing to manually flush the cache when I make changes. I’m also concerned it would cause problems with some plugins.

To make sure memory usage doesn’t get out of control I’ve also limited my FCGID processes and I don’t allow them to spawn any child processes. I also don’t terminate the FCGID processes, since this would clear the cache. This could be a potential memory problem so I’ll have to monitor it. The following settings are specified in my /etc/apache2/conf.d/php-fcgid.conf.

FcgidMaxProcesses 2
DefaultInitEnv PHP_FCGI_MAX_REQUESTS 0
DefaultInitEnv PHP_FCGI_CHILDREN 0

This should keep memory usage below the physical memory in my server. PHP_FCGI_MAX_REQUESTS is set to zero based on this passage in the Apache mod_fcgid documentation:

By default, PHP FastCGI processes exit after handling 500 requests, and they may exit after this module has already connected to the application and sent the next request. When that occurs, an error will be logged and 500 Internal Server Error will be returned to the client. This PHP behavior can be disabled by setting PHP_FCGI_MAX_REQUESTS to 0, but that can be a problem if the PHP application leaks resources. Alternatively, PHP_FCGI_MAX_REQUESTS can be set to a much higher value than the default to reduce the frequency of this problem. FcgidMaxRequestsPerProcess can be set to a value less than or equal to PHP_FCGI_MAX_REQUESTS to resolve the problem.

I’ll have to monitor this. It’s possible the FCGID memory could keep growing until physical memory is filled.

PHP_FCGI_CHILDREN is set to 0 to prevent children from being spawned. From the Apache documentation:

PHP child process management (PHP_FCGI_CHILDREN) should always be disabled with mod_fcgid, which will only route one request at a time to application processes it has spawned; thus, any child processes created by PHP will not be used effectively. (Additionally, the PHP child processes may not be terminated properly.) By default, and with the environment variable setting PHP_FCGI_CHILDREN=0, PHP child process management is disabled.

Additional Notes

The screenshot below shows my status (using apc.php) a couple hours after caching started. The screenshot was taken when the cache was still configured for 256 MB. I am running a unique cache for each FCGID process, so refreshing the page moves between the instances (randomly). The sizes of the three caches stay pretty close to one another, but they are clearly unique caches.

APC Info Page

When running top, each of my three FCGID processes levels off at about 25% memory usage; if my math is right that’s 75% of 512 MB. Yet free (and top) show 350 MB still free. If each process were allocating the full 256 MB to its own cache the total would be well over 100%, assuming free and top counted the shared memory in use. So I’m assuming free doesn’t see the shared memory and reports it as available. To my limited understanding this makes sense, since the shared memory is still available to any application. The actual usage of my three caches never grew above the available free memory.
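The memory figures above came from the standard tools; as an illustration, these are the sorts of checks involved (assuming the usual procps and util-linux utilities are installed):

```shell
# Overall memory picture in megabytes -- this is where figures like
# "350 MB still free" come from.
free -m
# List System V shared memory segments; if APC uses SysV shared memory,
# its cache segments and sizes show up here.
ipcs -m
```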

APC does require some work to configure and optimize it. To start with I’d enabled it for all my sites and PHP. This resulted in a lot of cache fragmentation and a few cache reloads after a few hours.

Benchmarks and timings on a test server are fine for comparisons, but I’ll need to run a configuration for awhile to see how it does, then start tweaking it. Right now my server seems healthy. Using APC did improve performance over no caching of any type, but I’m not sure of its benefit once page caching is used, especially with a preloaded cache.

Right now my main concern is that I have too few FCGID processes, only two. I used some load testing tools and didn’t cause any errors. I do have page caching enabled, so PHP shouldn’t be needed when serving a page from the cache. I’ll be monitoring my error logs.

Anyone else with experience using APC with WordPress?

Apache File Does Not Exist Error

I was getting a “File does not exist: /etc/apache2/htdocs” error in my default Apache error log. The error appeared every 5 minutes. All my sites were fine but this was getting annoying so it was time to do something about it.

I spent some time over the past week killing some bugs and making some adjustments to my server. One of them was a message being logged to my default Apache error log (/var/log/apache2/error.log). The error was:

[Sun May 27 22:05:03 2012] [error] [client] File does not exist: /etc/apache2/htdocs

The error was being logged exactly on every five-minute mark, no matter when the server was started. The interesting parts here are that none of my sites are configured to listen on the loopback address (127.0.0.1), although some are set to run on any address. Also, /etc/apache2 is not the document root for any of my sites. All of my sites were working and properly configured. A Google search showed I was not alone with the problem, but turned up no good solutions.

What I ended up doing is specifying a default document root for a directory that does exist. I used the default site for the server.

sudo nano /etc/apache2/conf.d/DefDocRoot

I put in one line:

DocumentRoot /valid/path/to/site/public

I reloaded the Apache configuration and the error went away. It seems there’s a default site of some type configured into Apache. It might be something I configured, but apache2ctl -S doesn’t show any syntax errors and all my sites seem fine.

Adjusting Logrotate and Lessons Learned Redux

A syntax error in a logrotate configuration file was breaking Apache for me. So I dug into the logrotate man page and made some other changes to my logrotate strategy, including date stamping files when they are rotated. This is the logrotate configuration for my Apache virtual hosts.

Back in October I had some issues with logs and adjusted logrotate. This weekend I made additional changes while killing some website bugs and resolving annoyances.

By default logrotate appends a numeral to the file name as it’s rotated (at least on Debian 6). There is a configuration file parameter dateext that will append the date instead. There’s also a parameter dateformat if the default extension isn’t suitable.

I also had an issue where logrotate failed when I deleted a test website but didn’t update the logrotate configuration to remove its logs. I added the missingok configuration parameter to handle this in the future.

I also changed logrotate to rotate files weekly or when the size exceeds 100MB. I’ll see how this works out and adjust it if needed.

I also had a self-inflicted wound where I left out the opening curly bracket. This caused logrotate to rotate apache2ctl itself, causing my Apache problems. I was probably sloppy: I made a quick change after testing and somehow deleted the bracket.

To test a logrotate configuration, the following command can be used:

sudo /usr/sbin/logrotate -vfd /etc/logrotate.d/config_file_to_test

The -vfd switches mean run in verbose mode, force rotation even if it’s not needed, and run in debug mode. Debug mode means no logs will actually be rotated. Sudo is not needed if logged on as the root user.

The file /var/lib/logrotate/status is logrotate’s memory of when files were last rotated. Deleting a file’s entry from there will force it to be acted upon the next time logrotate runs. It’s also a good place to see which files are being rotated.

So, the logrotate configuration for my websites is now as follows (the log path is a placeholder for my actual site log locations):

/path/to/site/logs/*.log {
        weekly
        size 100M
        rotate 5
        missingok
        dateext
        postrotate
                /usr/sbin/apache2ctl graceful > /dev/null
        endscript
}

This rotates the logs weekly (or when they reach 100 MB) and saves the last 5 logs. I experimented with mailing the logs when it came time to delete them, but decided it would be too many emails and it would be easier to grab the files once a month. I may increase the number of logs I save from 5 to give me some extra time to grab them.