Replacing lines matching a multi-line pattern (sed/perl/awk)
Dear Unix Forums,
I am hoping you can help me with a pattern matching problem.
What am I trying to do?
I want to replace multiple lines of a text file (that match a multi-line pattern) with a single line of text. These patterns can span several lines and do not always have the same number of line breaks in between.
Example input file
One of the simpler patterns may look like this (pseudocode; meaning that only parts of the line are relevant and that the number of line breaks can vary):
The matching text should be replaced with a string (e.g. "@MATCH").
In the end, the file should look like this:
Desired output
Current solution for adjacent lines only:
Unfortunately, this does not seem to work across several line breaks (i.e. when there is a "gap" between the lines containing RtlInitAnsiString and memmove).
Stuff I tried that didn't match anything:
Any ideas how to get this kind of multi-line pattern matching to work? I'd prefer sed or perl, but awk is fine too.
What should the output be for the input:
(which is your sample input file with line 2 removed)? Is line 1 kept in the output as is, or should lines 1 through 4 be changed to a single:
output line?
What should happen if there are more than 3 newlines between @CAL RtlAnsiStringToUnicodeString and @CAL memmove, and there are no other occurrences of @CAL RtlAnsiStringToUnicodeString between them?
In response to Don's questions: This input file...
...should turn into this (minimal "destruction"):
If there are more than 3 newlines between the first and second part of the pattern, nothing should happen. The replacement should only be made as long as the "maximum gap" (in this case 3) is not exceeded. So if the input file looked like this:
...the script should NOT replace the large block of "RtlAnsiStringToUnicodeString".
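The gap rule described above can be sketched line by line. This is only a minimal sketch in Python, not the script referred to elsewhere in the thread. The function name `collapse_blocks` and its parameters are made up for illustration. It assumes a two-part pattern (the `@CAL RtlAnsiStringToUnicodeString` and `@CAL memmove` lines from Don's question) and counts the "gap" as the number of newlines between the two matching lines, so adjacent lines have a gap of 1. The sample input lines are invented placeholders.

```python
import re


def collapse_blocks(lines, start_re, end_re, max_newlines=3, marker="@MATCH"):
    """Replace each start..end block with `marker` if the gap is small enough.

    Assumption: the gap is the number of newlines between the start line and
    the end line (adjacent lines -> 1). A later start line supersedes an
    earlier one, so the smallest possible block is replaced.
    """
    start = re.compile(start_re)
    end = re.compile(end_re)
    out = []
    start_pos = None  # index in `out` of the most recent unmatched start line

    for line in lines:
        if start_pos is not None and end.search(line):
            if len(out) - start_pos <= max_newlines:
                # Drop the start line and everything after it, emit the marker.
                del out[start_pos:]
                out.append(marker)
                start_pos = None
                continue
            # Gap too large: leave the lines untouched and forget this start.
            start_pos = None
        if start.search(line):
            start_pos = len(out)
        out.append(line)
    return out


# Hypothetical sample input; only the two @CAL lines come from the thread.
sample = [
    "line a",
    "@CAL RtlAnsiStringToUnicodeString",
    "line b",
    "@CAL memmove",
    "line c",
]
print("\n".join(collapse_blocks(sample,
                                r"@CAL RtlAnsiStringToUnicodeString",
                                r"@CAL memmove")))
```

Working on a list of lines instead of one long regex keeps the "maximum gap" an ordinary integer parameter, which may be easier to adjust per pattern than a `{0,N}` quantifier.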
Your Python script works!
I changed the pattern to...
...so it would also match adjacent lines. Now I have to figure out how to turn this into a "one-liner" (I currently use "eval" to loop through a file containing pattern matching commands, mostly "sed") and what each part of the expression does. Up until now, my scripting endeavors were limited to rather basic stuff.
Does anyone know how Python compares to other approaches (awk, etc.) in terms of performance? The files I plan to analyze have upwards of 50,000 lines each and are matched against hundreds of single-line and multi-line patterns.
Cheers
Last edited by thefang; 02-25-2014 at 10:37 AM.
Reason: python<>perl mixup
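The thread gives no timings, so the performance question stays open. One thing that tends to matter whatever the tool is avoiding a new process and a fresh read of the file for every pattern. The sketch below, assuming the file fits in memory, compiles each pattern once and applies all of them to the same text. The name `apply_rules` and the example rules are hypothetical.

```python
import re


def apply_rules(text, rules):
    """Apply (pattern, replacement) pairs in order to one in-memory string.

    re.MULTILINE lets ^ and $ match at every line boundary, so a rule can
    describe one line or several consecutive lines.
    """
    compiled = [(re.compile(pattern, re.MULTILINE), replacement)
                for pattern, replacement in rules]
    for pattern, replacement in compiled:
        text = pattern.sub(replacement, text)
    return text


# Hypothetical rules: one single-line pattern and one two-line pattern.
rules = [
    (r"^foo$", "@A"),
    (r"^first\nsecond$", "@MATCH"),
]
print(apply_rules("foo\nfirst\nsecond\nlast\n", rules))
```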
In the awk piped to sed below I am trying to format a file by removing the odd xxxx_digits and the whitespace after them, then moving the even xxxx_digit to the line above it and adding a space between them. There may be multiple lines in the file but they are in the same format. The Filename_ID line is the last line... (4 Replies)
Hi
I'm using the following code to extract the lines (and redirect them to a txt file) after the pattern match. But the output includes the line with the pattern match.
Which option is to be used to exclude the line containing the pattern?
sed -n '/Conn.*User/,$p' > consumers.txt (11 Replies)
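One way to keep everything after the first matching line while leaving out the matching line itself, sketched in Python instead of sed. The function name `lines_after_match` is made up here, and the sample lines are invented.

```python
import re


def lines_after_match(lines, pattern):
    """Return the lines that follow the first line matching `pattern`."""
    rx = re.compile(pattern)
    it = iter(lines)
    for line in it:
        if rx.search(line):
            # The iterator now sits just past the matching line.
            return list(it)
    return []  # no line matched


# Hypothetical input; only the pattern comes from the post.
log = ["header", "Conn id=1 User=x", "first consumer", "second consumer"]
print(lines_after_match(log, r"Conn.*User"))
```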
Hi,
I have a log file which has session IDs in it. Each block in the log starts with a date entry, and a block may be a single line or multiple lines. I need to sed (or awk) out the lines/blocks that start with a date and include the session ID.
The files are large, at several GB.
My... (3 Replies)
Hi
I know sed and awk have options to give a range of line numbers, but
I need to replace a pattern on specific lines
Something like
sed -e '1s,14s,26s/pattern/new pattern/' file name
Can somebody help me with this?
I am fine with sed/awk/perl
Thank you in advance (9 Replies)
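Restricting a substitution to a handful of line numbers (1, 14 and 26 in the post) can be sketched as below. This is Python, not a corrected version of the sed command above, and the name `substitute_on_lines` is hypothetical.

```python
import re


def substitute_on_lines(lines, line_numbers, pattern, replacement):
    """Replace the first match of `pattern` only on the given 1-based lines."""
    rx = re.compile(pattern)
    wanted = set(line_numbers)
    return [
        rx.sub(replacement, line, count=1) if number in wanted else line
        for number, line in enumerate(lines, start=1)
    ]


# Hypothetical 30-line input in which every line contains the pattern.
lines = ["some pattern here"] * 30
result = substitute_on_lines(lines, [1, 14, 26], r"pattern", "new pattern")
print(result[0], result[1], sep="\n")
```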
I am trying to find a line in a file ("Replace_Flag") and replace it with a variable which holds the contents of a multi-line file.
myVar=`cat myfile`
sed -e 's/Replace_Flag/'$myVar'/' /pathto/test.file
myfile:
cat
dog
boy
girl
mouse
house
test.file:
football
hockey
Replace_Flag
baseball
... (4 Replies)
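Swapping a flag line for the contents of another file can be sketched as follows. The sketch is in Python and assumes the flag occupies a whole line, as it does in the visible part of test.file. The name `replace_flag_line` is made up, and the sample uses only the lines shown in the post, which is truncated.

```python
def replace_flag_line(lines, flag, replacement_lines):
    """Replace every line equal to `flag` with all of `replacement_lines`."""
    out = []
    for line in lines:
        if line.strip() == flag:
            out.extend(replacement_lines)
        else:
            out.append(line)
    return out


# Lines as shown in the post (the rest of test.file is cut off there).
myfile = ["cat", "dog", "boy", "girl", "mouse", "house"]
test_file = ["football", "hockey", "Replace_Flag", "baseball"]
print("\n".join(replace_flag_line(test_file, "Replace_Flag", myfile)))
```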
Hi friends,
This is sed & awk type question.
I have a text file which has numbers spread all over the file. I want to sum each series of numbers whenever I find one and produce an output file with the sum. For example
###start of input text file ####
abc
def
ghi
1
2
3
4
kjld
random... (3 Replies)
Sample file:
This is line one,
this is another line,
this is the PRIMARY INDEX line
l ;
This is another line
The command should find the line with “PRIMARY INDEX” and remove the last character from the line preceding it (in this case, a comma) and remove the first character from the line... (5 Replies)
I have an xml file that is stripped down to output that looks basically like:
<!-- TABLEA header -->
<tablea>
some fields
</tablea>
<!-- TABLEB header -->
<!-- TABLEC header -->
<tablec>
some fields
</tablec>
I want to remove the header... (3 Replies)
I couldn't figure out how to use sed or any other shell to do the following. Can anyone help? Thanks.
If a line contains a string (e.g., TODAY),
replace a string in the line above (e.g., replace "Raining" with "Sunny")
and replace a string in the line below (e.g., replace "Reading" with... (7 Replies)
Experts,
I am a beginner at Unix shell scripting.
Our source is a flat file which uses the CTRL+F character as the delimiter. We need to count the number of records in the file (CTRL+F) to perform file validation.
The following command is being used:
awk '{cnt+=gsub(//,"&")}END {print cnt}'... (4 Replies)
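Counting the delimiter can also be sketched in Python. The awk command above is cut off and its control character did not survive, so this is a separate illustration and not a completion of it. It relies on CTRL+F being ASCII code 6 (byte `0x06`); the name `count_delimiters` is hypothetical.

```python
def count_delimiters(data: bytes, delimiter: bytes = b"\x06") -> int:
    """Count occurrences of the delimiter byte (CTRL+F is ASCII 0x06)."""
    return data.count(delimiter)


# Hypothetical content: three fields separated by two CTRL+F characters.
sample = b"field1\x06field2\x06field3\n"
print(count_delimiters(sample))
```

Working on bytes avoids any dependence on the file's text encoding.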