I am trying to scrape a file with content similar to this:
"addressString":"12366 NY","eId":"64174f8e42b7fdfb837f68b","hasImage":false,"Price":5800,"Name":Bernard Bernoulli,"headline":"nice Fiat 500, red, slight damage to left mirror"
"addressString":"451 Citadel","eId":"sd3448e42b7368b","year":1976,"hasImage":true,"Price":12220,"Name":Edward Diego,"headline":"Mercedes SLX, no issues"
"addressString":"1321 Bejing","eId":"3102ffdb837fssdff3","Price":350,"Name":Jet Li,"headline":"Dodge Viper, no engine, no tires, no windshield; only cash"
I want to delete all lines with a "Price": lower than 950.
Also, the number and position of named sections (like "Price", "year", "Name", etc.) differ from line to line.
I tried with sed and a regex which ranges from 0 to 950:
sed '/"Price":([0-9]|[1-9][0-9]|[1-8][0-9]{2}|9[0-4][0-9]|950),/d' <inputfile >outputfile
…but it did not work.
Any help is appreciated.
Using sed on Ubuntu Linux 20.04
3 Answers
That’s because you did not specify the -E flag for sed. Without it, sed uses BRE (basic regular expressions), and parentheses there don’t have special meaning unless they are escaped with a backslash (\). You can either escape all symbols that have special meaning in your regex (here the parentheses, | and {}) with \ or use the -E option.
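A minimal sketch of both options, assuming GNU sed (as on Ubuntu 20.04) and a temporary sample file built from the lines in the question. The variable names are made up for illustration. Note that the pattern as written in the question also deletes a line whose Price is exactly 950, because 950 is one of the alternatives.

```shell
#!/bin/sh
# Sample input copied from the question, written to a temporary file.
input=$(mktemp)
cat > "$input" <<'EOF'
"addressString":"12366 NY","eId":"64174f8e42b7fdfb837f68b","hasImage":false,"Price":5800,"Name":Bernard Bernoulli,"headline":"nice Fiat 500, red, slight damage to left mirror"
"addressString":"451 Citadel","eId":"sd3448e42b7368b","year":1976,"hasImage":true,"Price":12220,"Name":Edward Diego,"headline":"Mercedes SLX, no issues"
"addressString":"1321 Bejing","eId":"3102ffdb837fssdff3","Price":350,"Name":Jet Li,"headline":"Dodge Viper, no engine, no tires, no windshield; only cash"
EOF

# Option 1: -E enables extended regular expressions, so ( ) | { } are special as written.
ere_result=$(sed -E '/"Price":([0-9]|[1-9][0-9]|[1-8][0-9]{2}|9[0-4][0-9]|950),/d' "$input")

# Option 2: stay with BRE and escape the special symbols.
# \| as alternation in BRE is a GNU sed extension, so this form is less portable.
bre_result=$(sed '/"Price":\([0-9]\|[1-9][0-9]\|[1-8][0-9]\{2\}\|9[0-4][0-9]\|950\),/d' "$input")

# Print the lines that survive the ERE version.
printf '%s\n' "$ere_result"
rm -f "$input"
```

With the sample data, only the line with "Price":350 matches the pattern and is deleted.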
This almost looks like JSON data; with some small tweaks it is, so you can parse the data with jq.
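The original command for this answer is not preserved here, so the following is only a sketch of what such tweaks could look like. It assumes jq is installed, that every line is a bare list of key/value pairs, and that the unquoted Name value never contains a comma or a double quote. The function name is hypothetical.

```shell
# Hypothetical helper: reads the question's line format on stdin and
# writes one JSON object per line for entries with Price >= 950.
filter_by_price_jq() {
    # Tweak 1: put quotes around the bare Name value.
    # Tweak 2: wrap each line in braces so it becomes a JSON object.
    sed -E 's/"Name":([^",]+),/"Name":"\1",/; s/^/{/; s/$/}/' |
        # Numeric comparison on the parsed Price field; -c prints compact one-line objects.
        jq -c 'select(.Price >= 950)'
}

# Usage: filter_by_price_jq < inputfile > outputfile
```

The result is valid JSON (with braces and a quoted Name), not the original line format. Lines without a Price field are dropped as well, because jq orders null below every number.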
Don’t use regexps for numeric comparisons, use numeric comparisons, e.g. using GNU awk for the 3rd arg to match(), or using any awk.
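The answer's own commands are not preserved here. As a sketch of the "any awk" approach, the following uses only POSIX awk features (match(), RSTART, RLENGTH, substr()). The function name and the optional limit argument are assumptions for illustration.

```shell
# Hypothetical helper: drop lines whose "Price" is numerically below a limit (default 950).
filter_by_price_awk() {
    awk -v min="${1:-950}" '
        # match() sets RSTART and RLENGTH to the span of "Price":<digits>, if present.
        match($0, /"Price":[0-9]+/) {
            # Skip the 8 characters of "Price": and convert the digits to a number.
            price = substr($0, RSTART + 8, RLENGTH - 8) + 0
            if (price < min) next
        }
        { print }   # lines at or above the limit, or without a Price field, are kept
    '
}

# Usage: filter_by_price_awk 950 < inputfile > outputfile
#
# GNU awk only (not run here): the 3rd argument to match() captures the digits directly:
#   gawk 'match($0, /"Price":([0-9]+)/, a) && a[1] + 0 < 950 { next } 1' inputfile
```

Unlike the regex in the question, this keeps a line whose Price is exactly 950, which matches the stated goal of deleting only prices lower than 950.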