I have data in the following format:
>ab:xy_a0by98-2 Movie= top gun actor= Tom Genere=Action Length=234 Credits=30 pe=1 summry=(Tom|action|234)
Top Gun is a 1986 American action drama film directed by Tony Scott, and produced by Don Simpson and Jerry Bruckheimer
>ab:xy_b0ha81-5 Movie= Thor actor= chris hemsworth Genere=Action Length=321 Credits=20 pe=0 summry=(chris|Action|321)
Thor embarks on a journey unlike anything he's ever faced a quest for inner peace
>ab:xy_c0ma65-1 Movie= Batman actor= Bale Genere=Action Length=251 Credits=30 pe=1 summry=(Bale|Action|251)
From American Psycho to Batman Begins to Vice, Christian Bale is a bonafide A-list star
But he missed out on plenty of huge roles along the way.
>ab:xy_d0fc78-2 Movie= Joker actor= Phoenix Genere=thriller Length=341 Credits=35 pe=2 summry=(phoenix|thriller|341)
Joker is a 2019 American psychological thriller film directed and produced by Todd Phillips
who co-wrote the screenplay with Scott Silver
>ab:xy_e0ra81-2 Movie= Superman actor= henry cavill Genere=Action Length=254 Credits=28 pe=1 summry=(cavill|action|254)
Henry William Dalgliesh Cavill is a British actor
He is known for his portrayal of Charles Brandon in Showtime's The Tudors
I want to extract all the entries that contain pe=1 (each entry starts with the > symbol), as follows:
>ab:xy_a0by98-2 Movie= top gun actor= Tom Genere=Action Length=234 Credits=30 pe=1 summry=(Tom|action|234)
Top Gun is a 1986 American action drama film directed by Tony Scott, and produced by Don Simpson and Jerry Bruckheimer
>ab:xy_c0ma65-1 Movie= Batman actor= Bale Genere=Action Length=251 Credits=30 pe=1 summry=(Bale|Action|251)
From American Psycho to Batman Begins to Vice, Christian Bale is a bonafide A-list star
But he missed out on plenty of huge roles along the way.
>ab:xy_e0ra81-2 Movie= Superman actor= henry cavill Genere=Action Length=254 Credits=28 pe=1 summry=(cavill|action|254)
Henry William Dalgliesh Cavill is a British actor
He is known for his portrayal of Charles Brandon in Showtime's The Tudors
I also want to format a few values as a table:
Name Length
ab:xy_a0by98-2 234
ab:xy_c0ma65-1 251
ab:xy_e0ra81-2 254
I tried `grep "pe=1" input.txt > output.txt`, but it extracted only the first line of each entry, not the description.
Any help appreciated…
2 Answers
This `sed` command should do the job:

1st solution (with GNU `awk`): With your shown samples, please try the following `awk` code, written and tested in GNU `awk`. A simple explanation: it checks whether a line starts with `>` and contains `pe=1`, and uses the `match` function with the regex `Length=([0-9]+)` to store the captured value in an array named `arr`, which gives the needed value of the Length string. If both conditions are TRUE, it prints the 1st field followed by the 1st item of array `arr`.

2nd solution (with any `awk`): With any version of `awk`, please try the following code, a small tweak of the 1st solution.
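
A minimal sketch of the approaches described above, not necessarily the exact commands from the answers. It assumes that `pe=1` appears in the header line surrounded by spaces, that the sample file below is a shortened stand-in for `input.txt`, and that `output.txt` and `table.txt` are acceptable output names. Stripping the leading `>` from the name is also an assumption, based on the table shown in the question.

```shell
# Shortened stand-in for the question's input.txt (descriptions abbreviated).
cat > input.txt <<'EOF'
>ab:xy_a0by98-2 Movie= top gun actor= Tom Genere=Action Length=234 Credits=30 pe=1 summry=(Tom|action|234)
Top Gun is a 1986 American action drama film
>ab:xy_b0ha81-5 Movie= Thor actor= chris hemsworth Genere=Action Length=321 Credits=20 pe=0 summry=(chris|Action|321)
Thor embarks on a journey
>ab:xy_c0ma65-1 Movie= Batman actor= Bale Genere=Action Length=251 Credits=30 pe=1 summry=(Bale|Action|251)
Christian Bale is a bonafide A-list star
But he missed out on plenty of huge roles along the way.
EOF

# sed: keep the most recent header line in the hold space.
# For every line, swap the header in, test it for " pe=1 ", and
# if it matches, swap back and print the current line.
sed -n -e '/^>/h' -e 'x;/ pe=1 /{x;p;d;}' -e 'x' input.txt > output.txt

# awk (any version): for header lines containing " pe=1 ", locate
# "Length=NNN" with match(); RSTART and RLENGTH delimit the match,
# and the 7 characters of "Length=" are skipped to leave the digits.
awk 'BEGIN { print "Name Length" }
/^>/ && / pe=1 / && match($0, /Length=[0-9]+/) {
    print substr($1, 2), substr($0, RSTART + 7, RLENGTH - 7)
}' input.txt > table.txt
```

With this sample, `output.txt` holds the top gun and Batman entries together with their description lines, and `table.txt` holds the header row plus `ab:xy_a0by98-2 234` and `ab:xy_c0ma65-1 251`. In GNU `awk`, `match($0, /Length=([0-9]+)/, arr)` can be used instead, with the value read from `arr[1]`; the three-argument form is a GNU extension, which is why the portable version falls back to `RSTART` and `RLENGTH`.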