Linux shell commands 104

1. regular expression
2. grep
3. cut & concatenate
4. sed
5. awk
6. Misc
- 6.1. Parsing email address or url from text
- 6.2. delete a sentence containing a word

Chapter 4 Text Editing

1 regular expression

1.1 Basic regular expressions

regex	Description
^	start of line
$	end of line
.	matches any one character
[]	matches any one char in [chars]
[^]	matches any one char EXCEPT in [chars]
[-]	matches any char within range [chars]
?	matches one or zero times
+	matches one or more times
*	matches zero or more times
()	substring as one item to match
{n}	match n times
{n,}	match at least n times
{n,m}	match n to m times
|	alternation, OR (written \| in GNU basic regex)
\	escape
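
A minimal sketch of a few operators from the table, run through `grep -E` (extended syntax, so `?`, `+`, `{}`, `()` and `|` need no backslash). The sample words and variable names are made up for illustration.

```shell
# Hypothetical sample input, one candidate string per line.
sample=$(printf '%s\n' 'color' 'colour' 'colouur' 'cat' 'dog' 'a1b22')

# ? : "u" zero or one time, with ^ and $ anchoring the whole line
optional_u=$(printf '%s\n' "$sample" | grep -E '^colou?r$')

# {2,} : "u" at least twice
many_u=$(printf '%s\n' "$sample" | grep -E 'u{2,}')

# () and | : alternation inside a group
pets=$(printf '%s\n' "$sample" | grep -E '^(cat|dog)$')

# [0-9]+ with -o : print only the matched digits, one match per line
digits=$(printf '%s\n' "$sample" | grep -E -o '[0-9]+')

printf '%s\n' "$optional_u" "$many_u" "$pets" "$digits"
```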

1.2 POSIX character classes

regex	Description
[:alpha:]	alphabetic characters
[:digit:]	digits
[:alnum:]	letters & digits
[:lower:]	lowercase
[:upper:]	uppercase
[:punct:]	punctuation
[:blank:]	space & tab
[:space:]	whitespace
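
A short sketch of how these classes are used in practice. In a regex the class name goes inside a bracket expression, hence `[[:digit:]]`, while `tr` takes `[:punct:]` directly. The sample line and variable names are assumptions for illustration.

```shell
# Hypothetical sample line.
line='Hello, World 42!'

# Delete every punctuation character
no_punct=$(printf '%s\n' "$line" | tr -d '[:punct:]')

# Map lowercase letters to uppercase
upper=$(printf '%s\n' "$line" | tr '[:lower:]' '[:upper:]')

# Extract runs of digits (basic regex: one digit followed by zero or more digits)
digits=$(printf '%s\n' "$line" | grep -o '[[:digit:]][[:digit:]]*')

printf '%s\n' "$no_punct" "$upper" "$digits"
```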

1.3 Perl-style

regex	Description
\b	word boundary
\B	non-word boundary
\d	single digit
\D	single non-digit
\w	single word character
\W	single non-word character
\n	newline
\s	single whitespace
\S	single non-whitespace
\r	carriage return

# IP address
[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}
[[:digit:]]{1,3}\.[[:digit:]]{1,3}\.[[:digit:]]{1,3}\.[[:digit:]]{1,3}
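
As a sketch, the first IP pattern above can be tried with `grep -E -o` on a made-up log line. Note that it only checks the shape of the address: something like 999.999.999.999 would also match.

```shell
# Hypothetical log line containing two addresses and one time stamp.
log='client 192.168.1.10 connected to 10.0.0.1 at 12:30'

# -E enables {1,3}; -o prints each match on its own line
ips=$(printf '%s\n' "$log" | grep -E -o '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}')

printf '%s\n' "$ips"
```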

2 grep

The standard Unix utility for searching text.

grep PATTERN FILE
# Extended regular expression
grep -E PATTERN FILE
egrep PATTERN FILE
# Only matched portion
grep -o -E PATTERN
# Invert match: print lines NOT containing PATTERN
grep -v PATTERN FILE
# Count matching lines
grep -c PATTERN FILE
# Search recursively, printing line numbers
grep -R -n PATTERN DIRE
# Byte Offset
grep -b -o PATTERN FILE
# List files that contain a match
grep -l PATTERN FILE
# List files that contain no match
grep -L PATTERN FILE
# Ignore case of pattern
grep -i PATTERN FILE
# Multi patterns
grep -e PATTERN1 -e PATTERN2 FILE
# Print 4 lines After each match
grep -A 4 PATTERN FILE
# Print 4 lines Before each match
grep -B 4 PATTERN FILE
# Print 4 lines of Context on both sides of each match
grep -C 4 PATTERN FILE
# Include or exclude files by glob (quote the glob so the shell does not expand it)
grep -r PATTERN DIRE --include='*.c' --include='*.cpp'
grep -r PATTERN DIRE --exclude='readme'
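
A small sketch that exercises several of the options above on a temporary file. The file contents and variable names are invented for the example.

```shell
# Build a hypothetical four-line log file.
tmp=$(mktemp)
printf '%s\n' 'error: disk full' 'info: started' 'Error: timeout' 'info: stopped' > "$tmp"

count=$(grep -c 'info' "$tmp")          # number of lines containing "info"
count_i=$(grep -i -c 'error' "$tmp")    # same, ignoring case
non_info=$(grep -v 'info' "$tmp")       # lines without "info"
numbered=$(grep -n 'timeout' "$tmp")    # match prefixed with its line number
after=$(grep -A 1 'disk' "$tmp")        # match plus one line after it

rm -f "$tmp"
printf '%s\n' "$count" "$count_i" "$non_info" "$numbered" "$after"
```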

3 cut & concatenate

Column-wise cutting of a file

# Field 2 in file (fields are tab-separated by default)
cut -f2 FILE
# Bytes 1st to 5th
cut -b1-5 FILE
# Characters from the 3rd to the end of line
cut -c3- FILE
# Output delimiter between the selected ranges (GNU cut)
cut -c1-3,5- --output-delimiter ',' FILE
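
A quick sketch of `cut` on a passwd-style record. It uses `-d` to set the input field delimiter, which is not listed above; the record and variable names are made up.

```shell
# Hypothetical colon-separated record.
rec='alice:x:1000:1000:Alice:/home/alice:/bin/bash'

user=$(printf '%s\n' "$rec" | cut -d: -f1)          # first field
user_shell=$(printf '%s\n' "$rec" | cut -d: -f1,7)  # fields 1 and 7, rejoined with ":"
first5=$(printf '%s\n' "$rec" | cut -c1-5)          # characters 1 to 5
from3=$(printf '%s\n' 'abcdef' | cut -c3-)          # 3rd character to end of line

printf '%s\n' "$user" "$user_shell" "$first5" "$from3"
```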

Column-wise concatenation of files

paste FILE1 FILE2
# Delimiter
paste FILE1 FILE2 -d ','
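
A minimal sketch of `paste` joining two files line by line, first with the default tab and then with a comma. File names and contents are hypothetical.

```shell
# Two hypothetical single-column files in a scratch directory.
dir=$(mktemp -d)
printf '%s\n' 'alice' 'bob' > "$dir/names.txt"
printf '%s\n' '30' '25' > "$dir/ages.txt"

tabbed=$(paste "$dir/names.txt" "$dir/ages.txt")        # joined with a tab
comma=$(paste -d ',' "$dir/names.txt" "$dir/ages.txt")  # joined with a comma

rm -rf "$dir"
printf '%s\n' "$tabbed" "$comma"
```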

4 sed

Stream editor

# First occurrence of pattern in each line
sed 's/PATTERN/REPLACE/' FILE
# Write changes Into file
sed -i 's/PATTERN/REPLACE/' FILE
# Global replace
sed 's/PATTERN/REPLACE/g' FILE
sed 's:PATTERN:REPLACE:g' FILE
sed 's|PATTERN|REPLACE|g' FILE
# Replace the Nth and all later occurrences in each line (here N=2; GNU sed)
sed 's/PATTERN/REPLACE/2g' FILE
# Delete blank lines
sed '/^$/d' FILE
# Delete lines containing PATTERN
sed '/PATTERN/d' FILE
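# & stands for the whole matched string; here every word (\w\+, GNU sed) gets bracketed
sed 's/\w\+/[&]/g' FILE
# \1 stands for the first captured group \(PATTERN\)
sed 's/STRING\(PATTERN\)/\1/' FILE
# Multiple expressions
sed 'exp' FILE | sed 'exp'
sed 'exp; exp' FILE
# Double quotes let the shell expand $VAR inside the expression
sed "s/$VAR/REPLACE/" FILE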
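
The substitutions above, sketched on inline sample text so the effect of each is visible. The inputs and variable names are assumptions; `[a-z][a-z]*` is used instead of `\w\+` so the example does not depend on GNU extensions.

```shell
# Hypothetical sample line.
text='one two two two'

first=$(printf '%s\n' "$text" | sed 's/two/2/')               # first occurrence only
all=$(printf '%s\n' "$text" | sed 's/two/2/g')                # every occurrence
wrapped=$(printf '%s\n' "$text" | sed 's/[a-z][a-z]*/[&]/g')  # & is the whole match
value=$(printf '%s\n' 'key=value' | sed 's/key=\(.*\)/\1/')   # \1 is the captured group
no_blank=$(printf 'a\n\nb\n' | sed '/^$/d')                   # drop empty lines

# Double quotes so the shell substitutes $word before sed runs
word='two'
from_var=$(printf '%s\n' "$text" | sed "s/$word/2/g")

printf '%s\n' "$first" "$all" "$wrapped" "$value" "$no_blank" "$from_var"
```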

5 awk

Pattern scanning and processing for data streams

awk 'BEGIN {statements} {statements} END {end statements}'
# Double quotes also work, but the shell then expands $1, $NF, etc. unless they are escaped
awk "BEGIN {statements} {statements} END {end statements}"
# Print 3rd & 2nd field of every line
awk '{ print $3, $2 }' FILE
# Count number of lines; NR is the number of records read, NF the number of fields in the current record
awk 'END { print NR }' FILE
# Pass shell variables into awk
awk -v VARI="$VAR" '{ print VARI }' FILE
awk '{ print v1, v2 }' v1="$var1" v2="$var2" FILE
# Filtering lines
awk '/START_PATTERN/, /END_PATTERN/' FILE
awk 'NR==1,NR==4' FILE
awk 'NR < 5' FILE
awk '!/linux/' FILE
# Setting delimiter
awk -F: '{ print $NF }' /etc/passwd
awk 'BEGIN { FS=":" } { print $NF }' /etc/passwd
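
A brief sketch of the awk idioms above on a three-line sample. The data, the `min_age` variable and the summing example are made up for illustration.

```shell
# Hypothetical whitespace-separated records: name, age, city.
data=$(printf '%s\n' 'alice 30 paris' 'bob 25 rome' 'carol 41 oslo')

swapped=$(printf '%s\n' "$data" | awk '{ print $3, $1 }')               # reorder fields
lines=$(printf '%s\n' "$data" | awk 'END { print NR }')                 # count records
total=$(printf '%s\n' "$data" | awk '{ sum += $2 } END { print sum }')  # sum column 2
middle=$(printf '%s\n' "$data" | awk 'NR==2,NR==3 { print $1 }')        # range of lines

# Pass a shell variable in with -v and filter on it
min_age=35
older=$(printf '%s\n' "$data" | awk -v min="$min_age" '$2 > min { print $1 }')

# -F sets the field separator; $NF is the last field
last=$(printf '%s\n' 'a:b:c' | awk -F: '{ print $NF }')

printf '%s\n' "$swapped" "$lines" "$total" "$middle" "$older" "$last"
```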

6 Misc

6.1 Parsing email address or url from text

egrep -o '[A-Za-z0-9]+@[A-Za-z0-9]+\.[a-zA-Z]{2,4}' FILE
egrep -o "http://[A-Za-z0-9]+\.[a-zA-Z]{2,3}" FILE
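
The two patterns above, sketched on a made-up line using `grep -E -o` (equivalent to `egrep -o`). They are deliberately simple: addresses with dots or hyphens before the `@`, `https` links, and longer top-level domains would need a wider pattern.

```shell
# Hypothetical text containing two addresses and one link.
text='Contact bob@example.com or alice99@mail.org; docs at http://example.com/help'

emails=$(printf '%s\n' "$text" | grep -E -o '[A-Za-z0-9]+@[A-Za-z0-9]+\.[a-zA-Z]{2,4}')
urls=$(printf '%s\n' "$text" | grep -E -o 'http://[A-Za-z0-9]+\.[a-zA-Z]{2,3}')

printf '%s\n' "$emails" "$urls"
```

The URL match stops at the domain, so the `/help` path is not part of the output.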

6.2 delete a sentence containing a word

# [^.]* matches any run of characters other than '.', so the match stays within one sentence
sed 's/ [^.]*PATTERN[^.]*\.//g' FILE
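
A sketch of the command above with `rain` as the word to look for. The paragraph is invented. Because the pattern starts with a space, a sentence at the very beginning of a line is left alone.

```shell
# Hypothetical three-sentence paragraph on one line.
para='I like tea. I hate rain. I like sun.'

# Remove the sentence that mentions "rain", including its leading space and final period
cleaned=$(printf '%s\n' "$para" | sed 's/ [^.]*rain[^.]*\.//g')

printf '%s\n' "$cleaned"
```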
