
Working with Regular Expressions


Introduction
Regular expressions are a way of matching and manipulating text using patterns built from literal characters and metacharacters. They are often used by programmers, but can also be used in shell commands and in some programs (e.g. vi and other text editors). The examples that follow are mainly based upon the regular expression format used by Perl. Largely the same syntax is available in PHP through the preg family of regular expression functions. Some programs / languages may use different syntax or metacharacters, but the general concept is the same.
Getting started
Regular expressions can initially be compared with the wildcard characters that you may already be familiar with, but regular expressions are much more powerful. For example, most people are familiar with ? and * when used on the command line to mean match any single character and match any number of characters, respectively.
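As a rough illustration of that comparison, the sketch below sets shell-style wildcards beside regular expressions that behave the same way. It uses Python's fnmatch and re modules rather than Perl, which is a choice made here for illustration and not something the original examples use; the file names are made up. The "." and "\." used in the patterns are explained in the metacharacter list further on.

```python
import re
from fnmatch import fnmatchcase

# Shell wildcard ? (any single character) corresponds to the regex "."
print(fnmatchcase("file1.txt", "file?.txt"))                   # True
print(re.fullmatch(r"file.\.txt", "file1.txt") is not None)    # True

# Shell wildcard * (any number of characters) corresponds to the regex ".*"
print(fnmatchcase("notes.txt", "*.txt"))                       # True
print(re.fullmatch(r".*\.txt", "notes.txt") is not None)       # True

# Two characters where only one is allowed: neither form matches
print(fnmatchcase("file12.txt", "file?.txt"))                  # False
print(re.fullmatch(r"file.\.txt", "file12.txt") is not None)   # False
```

Note that re.fullmatch is used so that the whole name has to match, which is how shell wildcards behave.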
To match using a regular expression, the expression is first enclosed within forward slashes. For example:
/Linux/ would match on the word Linux, which could occur anywhere in the string being tested.

Note that whilst forward slashes are the most common delimiter, it is often possible to replace them with another punctuation character (in Perl this requires writing the match operator explicitly, e.g. m#/usr/bin#). This is useful when matching a path name or URL containing slashes that would otherwise need to be escaped.
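A minimal sketch of the same match in Python, assuming the re module is used in place of Perl (the function name is made up for this example). Python has no slash delimiters: the pattern is an ordinary string, so slashes in a path need no escaping.

```python
import re

def contains_linux(text):
    """Return True if the pattern Linux occurs anywhere in text."""
    # re.search scans the whole string, much like /Linux/ in Perl
    return re.search(r"Linux", text) is not None

print(contains_linux("I run Linux at home"))   # True
print(contains_linux("I run BSD at home"))     # False

# No delimiters, so the slashes in a path are just ordinary characters
print(re.search(r"/usr/bin", "/usr/bin/perl") is not None)   # True
```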
Replacing characters or strings
Regular expressions are often used to find and replace text, so the example:
s/Linux/UNIX/
would replace the word Linux with UNIX (useful if you wanted to convert some documentation to a more generic UNIX reference rather than restricting it to Linux). The "s" means substitute; in the earlier match there is an implied "m" character meaning match. Depending upon the program / language being used you may or may not need to include the "s" or "m" characters.
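For comparison, here is a small sketch of the same substitution using Python's re.sub (Python is an assumption of this example, and replace_first is a made-up name). Unlike s/Linux/UNIX/, re.sub replaces every match by default, so count=1 is passed to stop after the first one.

```python
import re

def replace_first(text):
    """Replace only the first occurrence of Linux with UNIX."""
    # count=1 stops after the first match, like s/Linux/UNIX/ with no modifiers
    return re.sub(r"Linux", "UNIX", text, count=1)

print(replace_first("Linux is free. Linux is open."))
# UNIX is free. Linux is open.
```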

Modifiers and special characters
There are also a number of modifiers that can be added to the end of the regular expression. The two most common are:
g means match globally, i.e. replace all occurrences in the string rather than stopping after the first one (which is probably what we really wanted for the earlier example).
i means ignore case (i.e. perform a case-insensitive match).

So our earlier example could become:
s/Linux/UNIX/gi
to replace all occurrences regardless of case.
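The two modifiers can be sketched in Python as follows, again assuming the re module rather than Perl (the function name is made up). Python has no trailing modifier letters: replacing all occurrences is the default behaviour of re.sub, and ignoring case is requested with the re.IGNORECASE flag.

```python
import re

def replace_all_ignore_case(text):
    """Replace every occurrence of Linux, in any letter case, with UNIX."""
    # Global replacement is the default; IGNORECASE plays the role of the i modifier
    return re.sub(r"Linux", "UNIX", text, flags=re.IGNORECASE)

print(replace_all_ignore_case("linux, Linux and LINUX"))
# UNIX, UNIX and UNIX
```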

The above examples are nothing more than a basic find and replace, but the real power of regular expressions comes when you include metacharacters and other special commands. Some metacharacters are:
\ Escape the character following it (i.e. ignore special properties of the following character)
. Match any single character except new line
^ Match at the beginning of the string; as the first character inside square brackets it negates the list, so any character not in the list matches
$ Match at the end of the string
* Match the preceding element 0 or more times
? Match the preceding element 0 or 1 time
{...} Range of occurrences for the preceding element
[...] Match any of the characters between the brackets
(...) Groups part of a regular expression; the text matched by each group is often left in $x, where x is the number of the bracket pair (starting at 1)
| Matches either the preceding or following character/expression
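Several of the metacharacters listed above can be tried out with the short sketch below. It assumes Python's re module rather than Perl, so captured groups are read with group(1) and group(2) instead of $1 and $2; the helper name and the sample strings are made up for illustration.

```python
import re

def whole_match(pattern, text):
    """Return True if pattern matches the entire text."""
    return re.fullmatch(pattern, text) is not None

# \ escapes: \. is a literal period, while a bare . matches any character
print(whole_match(r"3\.14", "3.14"))    # True
print(whole_match(r"3\.14", "3x14"))    # False
print(whole_match(r"3.14", "3x14"))     # True

# * means 0 or more, ? means 0 or 1, {2,3} means between 2 and 3 times
print(whole_match(r"ab*c", "ac"))        # True
print(whole_match(r"colou?r", "color"))  # True
print(whole_match(r"a{2,3}", "aaa"))     # True
print(whole_match(r"a{2,3}", "aaaa"))    # False

# (...) groups and captures, | chooses between alternatives
m = re.search(r"(cat|dog)s? ([0-9]{4})", "We adopted 2 dogs 2019")
print(m.group(1))   # dog
print(m.group(2))   # 2019
```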

There are also some escape sequences / character classes that can be used:
\n Newline
\r Carriage return
\t Tab
\f Formfeed
\d Match a digit e.g. [0-9]
\D Match a non-digit e.g. [^0-9]
\w Match a word character (alphanumeric or underscore) e.g. [a-zA-Z_0-9]
\W Match a non-word character e.g. [^a-zA-Z_0-9]
\s Match a whitespace character e.g. [ \t\n\r\f]
\S Match a non-whitespace character e.g. [^ \t\n\r\f]
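The character classes above can be exercised with a brief sketch like the one below, assuming Python's re module rather than Perl; the sample text and phone number are invented. In Python 3 these classes also match non-ASCII digits, letters and whitespace by default, so the bracketed equivalents in the list hold exactly only for ASCII input or when the re.ASCII flag is given.

```python
import re

text = "Order 66\tshipped on day_3\n"

# \d+ finds runs of digits
digits = re.findall(r"\d+", text)
print(digits)    # ['66', '3']

# \w+ finds runs of word characters (letters, digits, underscore)
words = re.findall(r"\w+", text)
print(words)     # ['Order', '66', 'shipped', 'on', 'day_3']

# \s+ matches runs of whitespace, here spaces and a tab
fields = re.split(r"\s+", text.strip())
print(fields)    # ['Order', '66', 'shipped', 'on', 'day_3']

# \D matches anything that is not a digit, so this strips the punctuation
only_digits = re.sub(r"\D", "", "Tel: 555-0100")
print(only_digits)   # 5550100
```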

When using square brackets, some characters have different meanings depending on whether they are inside or outside the brackets. A period '.' outside the brackets will match any character (except a newline), but inside the brackets it will only match a literal period.
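The difference can be seen in a small sketch, assuming Python's re module rather than Perl and an invented sample string. The last line shows the other character from the list above that changes meaning inside brackets: a leading ^ negates the list.

```python
import re

sample = "abc a.c a-c"

# Outside brackets the period matches any single character
any_char = re.findall(r"a.c", sample)
print(any_char)       # ['abc', 'a.c', 'a-c']

# Inside brackets the period only matches a literal period
literal_dot = re.findall(r"a[.]c", sample)
print(literal_dot)    # ['a.c']

# A leading ^ inside brackets negates the list: anything except a, b or c
not_abc = re.findall(r"[^abc]", "abcd")
print(not_abc)        # ['d']
```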
Examples
/[df]og/ Matches dog or fog
/^The/ Matches "The" if it is at the beginning of the string
/^The end$/ Matches "The end" if that is the entire string
/The\s*end/ Matches "The end" and "The        end" but not "The - end"
/The.*end/ Matches "The end" as well as "The book will soon be coming to an end."
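The examples above can be checked with a short script such as the following sketch. It assumes Python's re module in place of Perl, so the patterns are written without the surrounding slashes; the non-matching strings "log", "At The end" and "The end." are extra cases added here for contrast.

```python
import re

# (pattern, text to test, whether a match is expected)
examples = [
    (r"[df]og", "dog", True),
    (r"[df]og", "fog", True),
    (r"[df]og", "log", False),
    (r"^The", "The end", True),
    (r"^The", "At The end", False),
    (r"^The end$", "The end", True),
    (r"^The end$", "The end.", False),
    (r"The\s*end", "The end", True),
    (r"The\s*end", "The        end", True),
    (r"The\s*end", "The - end", False),
    (r"The.*end", "The end", True),
    (r"The.*end", "The book will soon be coming to an end.", True),
]

results = []
for pattern, text, expected in examples:
    # re.search looks for the pattern anywhere in the text
    matched = re.search(pattern, text) is not None
    results.append(matched == expected)
    print(f"/{pattern}/ on {text!r}: {'match' if matched else 'no match'}")

print(all(results))   # True
```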

