I am trying to extract the text between two strings using the following regex:
(?s)Non-terminated Pods:.*?in total.\R(.*)(?=Allocated resources)
This regex looks fine in regex101, but it does not print the pod details when used with perl or grep -P. The command below produces empty output:
kubectl describe node | perl -le '/(?s)Non-terminated Pods:.*?in total.\R(.*)(?=Allocated resources)/m; printf "$1"'
Here is the sample input:
PodCIDRs: 10.233.65.0/24
Non-terminated Pods: (7 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits Age
--------- ---- ------------ ---------- --------------- ------------- ---
default foo 0 (0%) 0 (0%) 0 (0%) 0 (0%) 105s
kube-system nginx-proxy-kube-worker-1 25m (1%) 0 (0%) 32M (1%) 0 (0%) 9m8s
kube-system nodelocaldns-xbjp8 100m (5%) 0 (0%) 70Mi (4%) 170Mi (10%) 7m4s
Allocated resources:
Question:
- How can I extract the info from the above output so that it looks like the text below? What is wrong with the regex or the command that I am using?
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits Age
--------- ---- ------------ ---------- --------------- ------------- ---
default foo 0 (0%) 0 (0%) 0 (0%) 0 (0%) 105s
kube-system nginx-proxy-kube-worker-1 25m (1%) 0 (0%) 32M (1%) 0 (0%) 9m8s
kube-system nodelocaldns-xbjp8 100m (5%) 0 (0%) 70Mi (4%)
Question-2: What if I have two blocks of similar input? How can I extract the pod details then?
E.g., if the input is:
PodCIDRs: 10.233.65.0/24
Non-terminated Pods: (7 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits Age
--------- ---- ------------ ---------- --------------- ------------- ---
default foo 0 (0%) 0 (0%) 0 (0%) 0 (0%) 105s
kube-system nginx-proxy-kube-worker-1 25m (1%) 0 (0%) 32M (1%) 0 (0%) 9m8s
kube-system nodelocaldns-xbjp8 100m (5%) 0 (0%) 70Mi (4%) 170Mi (10%) 7m4s
Allocated resources:
....some
.......random data...
PodCIDRs: 10.233.65.0/24
Non-terminated Pods: (7 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits Age
--------- ---- ------------ ---------- --------------- ------------- ---
default foo-1 0 (0%) 0 (0%) 0 (0%) 0 (0%) 105s
kube-system nginx-proxy-kube-worker-2 25m (1%) 0 (0%) 32M (1%) 0 (0%) 9m8s
kube-system nodelocaldns-xbjp3-2 100m (5%) 0 (0%) 70Mi (4%) 170Mi (10%) 7m4s
Allocated resources:
4 Answers
Using GNU grep, you can use your regex with some tweaks:
- \K (match reset) after \R, to remove that line from the output
- the -z option, to treat input and output data as sequences of lines, each terminated by a zero byte (the ASCII NUL character) instead of a newline.
PS: The same regex will work with the second input block as well, with the header line shown before each block.
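A minimal sketch of the tweaked command, assuming GNU grep built with PCRE support (-P). The sample input is trimmed from the question, sample_input and pods_grep are hypothetical helper names used only for illustration, and the lazy (.*?) is an assumption added so that each match stops at the first Allocated resources line:

```shell
# Hypothetical helper: two trimmed blocks of the sample input from the question
sample_input() {
cat <<'EOF'
PodCIDRs: 10.233.65.0/24
Non-terminated Pods: (7 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits Age
--------- ---- ------------ ---------- --------------- ------------- ---
default foo 0 (0%) 0 (0%) 0 (0%) 0 (0%) 105s
Allocated resources:
....some
.......random data...
PodCIDRs: 10.233.65.0/24
Non-terminated Pods: (7 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits Age
--------- ---- ------------ ---------- --------------- ------------- ---
default foo-1 0 (0%) 0 (0%) 0 (0%) 0 (0%) 105s
Allocated resources:
EOF
}

# -z: treat the input as one NUL-terminated record, so the pattern can span newlines
# -o: print only the matched text; -P: PCRE, needed for \R, \K and the lookahead
# \K discards what was matched so far (the "Non-terminated Pods" line) from the output
# tr strips the NUL byte that -z writes after each match
pods_grep() {
  grep -zoP '(?s)Non-terminated Pods:.*?in total.\R\K(.*?)(?=Allocated resources)' | tr -d '\0'
}

sample_input | pods_grep
```

Against a live cluster the same filter would presumably follow the command from the question, as in kubectl describe node | pods_grep.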
Alternatively, you can use any version of sed for this job as well.

With some obvious assumptions, and keeping it close to the pattern in the question:
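A sketch of what such a command might look like, built from the switches and modifiers this answer goes on to describe (-0777, -n, /gs, captures assigned to @pods). The sample input is trimmed from the question, and sample_input and pods_perl are hypothetical helper names:

```shell
# Hypothetical helper: two trimmed blocks of the sample input from the question
sample_input() {
cat <<'EOF'
PodCIDRs: 10.233.65.0/24
Non-terminated Pods: (7 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits Age
--------- ---- ------------ ---------- --------------- ------------- ---
default foo 0 (0%) 0 (0%) 0 (0%) 0 (0%) 105s
Allocated resources:
....some
.......random data...
PodCIDRs: 10.233.65.0/24
Non-terminated Pods: (7 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits Age
--------- ---- ------------ ---------- --------------- ------------- ---
default foo-1 0 (0%) 0 (0%) 0 (0%) 0 (0%) 105s
Allocated resources:
EOF
}

# -0777 slurps all input into $_; -n loops over the input records (here just one)
# /s lets . match newlines; /g in list context returns the capture from every block
pods_perl() {
  perl -0777 -wne 'my @pods = /Non-terminated Pods:.*?in total.\n(.*?)(?=Allocated resources)/gs; print for @pods'
}

sample_input | pods_perl
```

Each capture already ends with a newline, so print is used here rather than say to avoid a blank line between blocks.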
(Note the modifiers on the regex in this line, which is too wide to fit on screen: /gs.)
The regex from the question works when used instead of the one in this answer (and with no /s modifier, as it should) on a single block of text. To work with multiple blocks, the (.*) in it needs to be changed to (.*?), so that it doesn't match all the way to the last Allocated...
The question doesn't say precisely how that regex is "used with perl", so I can't say what failed.
Comments on the command-line program above:
- The -0777 switch makes it read the whole file into a string, available in the program in the variable $_, to which the regex is bound by default. There is also the switch -g, an alias for -0777, available starting with 5.36.0.
- We still need the -n switch so that the program iterates over the "lines" of input (STDIN or a file). In this case the input record separator is undefined, so it's all just one "line".
- The regex captures are returned since the match operator is in list context, being assigned to the array @pods.
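The remarks above about -n, list context, and (.*) versus (.*?) can be checked with a small sketch that counts the captures returned for a two-block input. The input is trimmed from the question, and two_blocks, greedy_count, lazy_count and no_n_count are hypothetical names chosen for illustration:

```shell
# Hypothetical helper: two minimal blocks built from rows in the question
two_blocks() {
cat <<'EOF'
Non-terminated Pods: (7 in total)
default foo 0 (0%) 0 (0%) 0 (0%) 0 (0%) 105s
Allocated resources:
....some
.......random data...
Non-terminated Pods: (7 in total)
default foo-1 0 (0%) 0 (0%) 0 (0%) 0 (0%) 105s
Allocated resources:
EOF
}

# Greedy (.*) runs to the LAST "Allocated resources", so both blocks end up in one capture
greedy_count() {
  perl -0777 -ne 'my @pods = /Non-terminated Pods:.*?in total.\n(.*)(?=Allocated resources)/gs; print scalar(@pods), "\n"'
}

# Lazy (.*?) stops at the FIRST "Allocated resources" after each header: one capture per block
lazy_count() {
  perl -0777 -ne 'my @pods = /Non-terminated Pods:.*?in total.\n(.*?)(?=Allocated resources)/gs; print scalar(@pods), "\n"'
}

# Without -n the program never reads STDIN, $_ stays undefined, and nothing matches
no_n_count() {
  perl -0777 -e 'my @pods = /Non-terminated Pods:.*?in total.\n(.*?)(?=Allocated resources)/gs; print scalar(@pods), "\n"'
}

two_blocks | greedy_count   # prints 1
two_blocks | lazy_count     # prints 2
two_blocks | no_n_count     # prints 0
```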

With your shown samples, please try the following GNU awk code. Written and tested in GNU awk. A simple explanation: set RS to Non-terminated Pods:.*Allocated resources: for Input_file. Then, in the main program, check whether RT is not empty; if so, use awk's gsub function to replace (^|\n)Non-terminated Pods:[^\n]*\n or \nAllocated resources:\n* with an empty string in the RT variable, and then print its value, which gives the output shown in the samples.

A possible solution for very big files, reading line by line, could be as follows.
Select the range of lines of interest and remove the last one, which is not included in the desired output.
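One way to express this line-by-line range selection, sketched here with a sed address range as an assumption (it is not necessarily the tool this answer used). The sample input is trimmed from the question, and sample_input and pods_range are hypothetical helper names:

```shell
# Hypothetical helper: two trimmed blocks of the sample input from the question
sample_input() {
cat <<'EOF'
PodCIDRs: 10.233.65.0/24
Non-terminated Pods: (7 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits Age
--------- ---- ------------ ---------- --------------- ------------- ---
default foo 0 (0%) 0 (0%) 0 (0%) 0 (0%) 105s
Allocated resources:
....some
.......random data...
PodCIDRs: 10.233.65.0/24
Non-terminated Pods: (7 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits Age
--------- ---- ------------ ---------- --------------- ------------- ---
default foo-1 0 (0%) 0 (0%) 0 (0%) 0 (0%) 105s
Allocated resources:
EOF
}

# -n suppresses default printing; the address range selects each block,
# the two d commands drop the marker lines at both ends, and p prints the rest.
# Input is processed one line at a time, so memory use stays small on big files.
pods_range() {
  sed -n '/Non-terminated Pods:/,/Allocated resources:/{/Non-terminated Pods:/d;/Allocated resources:/d;p;}'
}

sample_input | pods_range
```

Unlike the slurping approaches, this never holds the whole input in memory, at the cost of matching the markers on single lines only.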
Output