[Unix-Linux] Text Processing

📋 These are my notes on what I learned from the Unix/Linux Tutorial!


Text Processing Commands

wc

cat chap1.txt
chapter 1-0 - Mary have a little lamb, its fleece was as white as snow
chapter 1-1 - Little Jack Horner Sat in the corner Eating a Christmas pie; He put in his thumb And pulled out a plum And said, “What a good boy am I!”
chapter 1-2 - Jack be nimble Jack be quick Jack jump over the candlestick
chapter 1-3 - Itsy bitsy spider climbed up the waterspout. Down came the rain and washed the spider out.
chapter 1-4 - Hot cross buns! Hot cross buns! One a penny, two a penny, Hot cross buns!
chapter 1-5 - Hickory, dickory, dock, The mouse ran up the clock. The clock struck one, The mouse ran down! Hickory, dickory, dock.
chapter 1-6 - Hey, diddle, diddle, The cat and the fiddle, The cow jumped over the moon
chapter 1-7 - Goosey, goosey, gander, Wither shall I wander? Upstairs and downstair And in my lady’s chamber.
chapter 1-8 - Georgie Porgie, pudding, and pie, Kissed the girls, and made them cry. When the boys came out to play, Georgie Porgie ran away.
chapter 1-9 - Cobbler, cobbler, mend my shoe. Get it done by half past two.%
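wc chap1.txt #prints lines, words, and bytes, then the file name
cp chap1.txt chap2.txt
wc chap1.txt chap2.txt #one row per file plus a "total" row
head -5 chap1.txt > chap2.txt #overwrite chap2.txt with the first 5 lines of chap1.txt
wc chap1.txt chap2.txt
wc -l chap1.txt #only the line count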
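
To see what each wc count means, here is a minimal sketch. It assumes a throwaway file called sample.txt (a made-up name, not part of the tutorial) created in a temporary directory.

```shell
# sample.txt is a hypothetical scratch file created only for this sketch
tmpdir=$(mktemp -d)
printf 'Jack be nimble\nJack be quick\n' > "$tmpdir/sample.txt"

# Reading from stdin (<) keeps the file name out of the output;
# tr -d ' ' strips the padding that some wc versions print before the number
lines=$(wc -l < "$tmpdir/sample.txt" | tr -d ' ')   # counts newline characters
words=$(wc -w < "$tmpdir/sample.txt" | tr -d ' ')   # counts whitespace-separated words
bytes=$(wc -c < "$tmpdir/sample.txt" | tr -d ' ')   # counts bytes

echo "lines=$lines words=$words bytes=$bytes"       # lines=2 words=6 bytes=29
rm -r "$tmpdir"
```

Since wc -l counts newline characters, a last line that does not end with a newline is not included in the count.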

$PS

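echo $PS0
echo $PS1 #primary prompt
echo $PS2 #continuation prompt, shown when a command spans several lines
PS1=\w #unquoted, the shell drops the backslash, so the prompt becomes a literal "w"
wls #the "w" at the start is the new prompt, followed by the command ls
chap1.txt chap2.txt
wPS2='\w ' #use straight quotes → in bash, \w is the working directory (zsh uses %~ instead)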

tee

cat chap2.txt | tee f1 f2 f3 #prints the text and also writes it to f1, f2 and f3
ls -l
cat chap2.txt | tee -a f1 #-a appends to f1 instead of overwriting it
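
A small sketch of the difference between plain tee and tee -a. It assumes scratch files f1 and f2 inside a temporary directory so that nothing in the working directory is touched.

```shell
tmpdir=$(mktemp -d)

# tee copies stdin to stdout AND to every file named, overwriting the files
printf 'one\n' | tee "$tmpdir/f1" "$tmpdir/f2" > /dev/null

# -a appends to the file instead of overwriting it
printf 'two\n' | tee -a "$tmpdir/f1" > /dev/null

f1_lines=$(wc -l < "$tmpdir/f1" | tr -d ' ')
f2_lines=$(wc -l < "$tmpdir/f2" | tr -d ' ')
echo "f1 has $f1_lines lines, f2 has $f2_lines"   # f1 has 2 lines, f2 has 1
rm -r "$tmpdir"
```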

tr = translate or delete

echo Hello World! | tr '[a-z]' '[A-Z]' #HELLO WORLD! -> quote the sets so the shell does not treat [a-z] as a filename pattern
echo Hello World! | tr o O #HellO WOrld!
echo Hello World! | tr "o" "a" #Hella Warld!
echo Hello World! | tr " " "\t" #Hello	World! -> \t is a tab, \n is a newline
echo Hello World! | tr -d " " #HelloWorld!
echo Hello World! | tr -d '[:punct:]' #Hello World
echo Hello World! | tr -cd '[:punct:]' #! -> -cd means "delete everything except punct", so the newline is deleted too
echo Hello World! | tr -s '[:alnum:]' #Helo World! -> -s squeezes repeated characters into one
echo Mary had a little lamb {sheep} | tr "{}" "()" #Mary had a little lamb (sheep)
  • stop a running command → ctrl + C
  • end the input (when tr is reading from the keyboard) → ctrl + D
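
The tr options above can be checked with a short sketch. The variable names are made up for illustration, and every set is quoted so the shell cannot expand it as a filename pattern.

```shell
# Character classes are a portable alternative to ranges like a-z
upper=$(echo 'Hello World!' | tr '[:lower:]' '[:upper:]')

# -d deletes every character in the set
no_punct=$(echo 'Hello World!' | tr -d '[:punct:]')

# -s squeezes runs of the same character into a single one
squeezed=$(echo 'Hello    World!' | tr -s ' ')

echo "$upper"      # HELLO WORLD!
echo "$no_punct"   # Hello World
echo "$squeezed"   # Hello World!
```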

grep

cat chap1.txt
chapter 1-0 - Mary have a little lamb, its fleece was as white as snow
chapter 1-1 - Little Jack Horner Sat in the corner Eating a Christmas pie; He put in his thumb And pulled out a plum And said, “What a good boy am I!”
chapter 1-2 - Jack be nimble Jack be quick Jack jump over the candlestick
chapter 1-3 - Itsy bitsy spider climbed up the waterspout. Down came the rain and washed the spider out.
chapter 1-4 - Hot cross buns! Hot cross buns! One a penny, two a penny, Hot cross buns!
chapter 1-5 - Hickory, dickory, dock, The mouse ran up the clock. The clock struck one, The mouse ran down! Hickory, dickory, dock.
chapter 1-6 - Hey, diddle, diddle, The cat and the fiddle, The cow jumped over the moon
chapter 1-7 - Goosey, goosey, gander, Wither shall I wander? Upstairs and downstair And in my lady’s chamber.
chapter 1-8 - Georgie Porgie, pudding, and pie, Kissed the girls, and made them cry. When the boys came out to play, Georgie Porgie ran away.
chapter 1-9 - Cobbler, cobbler, mend my shoe. Get it done by half past two.%
grep the chap1.txt #find "the" in the text
grep -i the chap1.txt #-i ignores case, so "The" and "THE" match too
grep -in "the" chap1.txt #-n adds the line number
grep -in "the" chap1.txt chap2.txt #with several files, each match is prefixed with its file name
grep -inw "the" chap1.txt #-w matches whole words only -> just "the", not the "the" inside "weather"
grep -c "the" chap1.txt #counts the lines that match (not every occurrence)
grep "the" chap1.txt
grep -R -c "the" Documents/ #search every file under the folder -> -R -c, -Rc and -cR all work the same!
grep -v "the" chap1.txt #-v inverts the match: lines WITHOUT "the"
grep -vc "the" chap1.txt #how many lines do not contain "the"
grep -n "mouse" chap1.txt
grep -n -B2 "mouse" chap1.txt #-B2 also prints the 2 lines before each match
grep -l "mouse" chap1.txt #only prints the names of files that contain "mouse"
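
Here is a hedged sketch of how -c, -i and -B behave. It uses a made-up three-line file called rhyme.txt instead of chap1.txt so that the expected counts are easy to verify by eye.

```shell
# rhyme.txt is a hypothetical scratch file, not one of the tutorial files
tmpdir=$(mktemp -d)
printf 'Hickory, dickory, dock\nThe mouse ran up the clock\nThe clock struck one\n' > "$tmpdir/rhyme.txt"

# -c counts matching LINES; only line 2 contains a lowercase "the"
count_cs=$(grep -c 'the' "$tmpdir/rhyme.txt")

# adding -i also matches "The", so lines 2 and 3 are counted
count_ci=$(grep -ci 'the' "$tmpdir/rhyme.txt")

# -B1 prints one line of context before each match; -n numbers the lines
before=$(grep -n -B1 'mouse' "$tmpdir/rhyme.txt")

echo "case-sensitive: $count_cs, ignoring case: $count_ci"
echo "$before"
rm -r "$tmpdir"
```

In the context output, matching lines are marked with `:` after the line number and context lines with `-`.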

cut = extract the characters

cut -c1 chap1.txt #only shows first char
cut -c1,4 chap1.txt #shows first and fourth char
cut -c1-4 chap1.txt #shows first to fourth char
cat grocery.txt
apple   1  $10  sweet
banana  2  $10  sweet/tart
kiwi    4  $10  sweet
cut -f1 grocery.txt #only shows first field, so only shows fruits, not qty or price
cat grocery.txt | tr "\t" ":" > grocery.lst #tabs become colons
cut -d":" -f1,3 grocery.lst #-d sets the delimiter -> shows fields 1 and 3 (fruit and price)
cut -c1-8,15-30 grocery.lst #shows char ranges (1-8) and (15-30) -> no space after the comma
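
The field examples only work when grocery.txt really contains tab characters. A minimal sketch, assuming a two-row copy of the file written with printf so the tabs are explicit:

```shell
tmpdir=$(mktemp -d)
# \t in the printf format writes a real tab between the fields
printf 'apple\t1\t$10\tsweet\nbanana\t2\t$10\tsweet/tart\n' > "$tmpdir/grocery.txt"

# Tab is the default delimiter for cut -f
names=$(cut -f1 "$tmpdir/grocery.txt")

# Convert tabs to colons, then pick fields 1 and 3 with -d':'
name_price=$(tr '\t' ':' < "$tmpdir/grocery.txt" | cut -d':' -f1,3)

echo "$names"        # apple, banana (one per line)
echo "$name_price"   # apple:$10, banana:$10 (one per line)
rm -r "$tmpdir"
```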

sort and reverse

sort grocery.txt #sort texts
man sort #get info about sort
sort -r grocery.txt #sort in reverse order
cut -f3 grocery.txt | sort #shows the price field in sorted order
cut -f3 grocery.txt | sort | uniq #removes repeated lines (uniq only drops adjacent duplicates, so sort first)
cat grocery.txt | sort -k2 #sort by the second field (key) instead of the start of the line

#only sweet fruit(not tart) and order by price
cat grocery.txt | grep -v "tart" | sort -k3
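
Because every price in grocery.txt is the same, sorting by quantity shows the effect of -k more clearly. A sketch under these assumptions: the file is tab-separated, the rows are deliberately written out of order, and -k2,2n (not used in the tutorial) limits the key to field 2 and compares it as a number.

```shell
tmpdir=$(mktemp -d)
printf 'kiwi\t4\t$10\tsweet\napple\t1\t$10\tsweet\nbanana\t2\t$10\tsweet/tart\n' > "$tmpdir/grocery.txt"
tab=$(printf '\t')   # a literal tab to use as the field separator

# Sort by quantity (field 2, numeric), then keep only the fruit names
by_qty=$(sort -t "$tab" -k2,2n "$tmpdir/grocery.txt" | cut -f1)

# Only sweet fruit (no "tart"), ordered by quantity
sweet_only=$(grep -v 'tart' "$tmpdir/grocery.txt" | sort -t "$tab" -k2,2n | cut -f1)

# Distinct prices: sort first so that uniq sees duplicates next to each other
prices=$(cut -f3 "$tmpdir/grocery.txt" | sort | uniq)

echo "$by_qty"       # apple, banana, kiwi (one per line)
echo "$sweet_only"   # apple, kiwi (one per line)
echo "$prices"       # $10
rm -r "$tmpdir"
```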



