Help with splitting a large text file into smaller ones
Hi Everyone,
I am using a CentOS 5.2 server as an sFlow log collector on my network. Currently I am using InMon's free sflowtool to collect the packets sent by my switches. I have a bash script running in an infinite loop that stops and restarts the log collection at set intervals - currently one minute.
I have written some fairly in-depth analysis using bash and PHP to display information on the collected logs with grep, uniq, awk/gawk, sort, etc. However, I would like to convert this data into a MySQL database to start building historic trending. The problem I have is that the log files are too big for PHP to handle in one piece (5-15 MB), while the shell is able to rip through them effortlessly.
I would like to split the text file into smaller files, one for each datagram.
Ideally the script would remove the datagram and header information before the first "startSample" and insert just the corresponding "datagramSourceIP xxxx" after each "startSample". But the main thing I am having a problem with is getting all the text between "startDatagram" and "endDatagram" into a separate file, maybe datag_00001 and so on.
If I could get this working, I'm sure I can hack my way through the rest. I have attached below two (simplified) example datagrams, so hopefully this will become clear.
Also, if anyone would like some help with getting sFlow running, please feel free to contact me.
regards,
Joe
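The reply itself isn't quoted here, but a minimal gawk sketch along these lines would do the split described above. The datag_ prefix and marker words come from the post; sflow.log is a placeholder name, and the datagramSourceIP insertion is left out:

Code:
gawk '
    /startDatagram/ { n++; out = sprintf("datag_%05d", n) }  # start a new output file per datagram
    out != ""       { print > out }                          # copy lines while inside a datagram
    /endDatagram/   { close(out); out = "" }                 # stop copying at the end marker
' sflow.log

Anything before the first "startDatagram" never matches, so the leading header is dropped for free.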
Wow, thanks for a VERY quick response. It worked perfectly the first time, although I needed to use gawk. I was almost certain that the solution lay with awk, but I am surprised at how elegant and concise the code is.
I will simplify the explanation a bit: I need to parse through an 87 MB file.
I have a single text file in the form of:
<NAME>house........
SOMETEXT
SOMETEXT
SOMETEXT
.
.
.
.
</script>
MORETEXT
MORETEXT
.
.
. (6 Replies)
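This one is cut off, but if the goal is the same kind of block split (everything from a <NAME> line through the closing </script> tag into its own file), the same awk pattern applies; awk streams the input, so the 87 MB size is no problem. The section_ prefix and bigfile.txt are placeholders:

Code:
awk '
    /^<NAME>/     { n++; out = sprintf("section_%05d.txt", n) }
    out != ""     { print > out }
    /^<\/script>/ { close(out); out = "" }
' bigfile.txt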
Hi,
I'm trying to split a large file into several smaller files.
The script will take two input arguments: argument1=filename and argument2=the number of files to split into.
In my large input file I have a header followed by 100009 records.
The first line is a header; I want this header in all my... (9 Replies)
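A bash sketch of one way to do this, assuming the header line should be copied to the top of every output file (which is where the question seems to be heading); the script name and the part_ prefix are made up:

Code:
#!/bin/bash
# usage: split_with_header.sh <filename> <number_of_files>
file=$1
n=$2
header=$(head -n 1 "$file")
records=$(( $(wc -l < "$file") - 1 ))   # record count, excluding the header
per=$(( (records + n - 1) / n ))        # ceiling division, so at most n files
tail -n +2 "$file" | split -l "$per" - part_
for f in part_*; do                     # prepend the header to each piece
    { printf '%s\n' "$header"; cat "$f"; } > "$f.tmp" && mv "$f.tmp" "$f"
done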
Hello all, newbie here. I've searched the forum and found many "how to split a text file" topics but none that are what I'm looking for.
I have a large text file (~15 MB). It contains a variable number of "paragraphs" (for lack of a better word) that are each of variable length. A... (3 Replies)
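The post is truncated, but if the "paragraphs" are separated by blank lines, awk's paragraph mode (an empty record separator) splits them with almost no code; the para_ file names and input.txt are assumptions:

Code:
awk 'BEGIN { RS = "" }                  # paragraph mode: blank lines separate records
     { f = sprintf("para_%04d.txt", NR); print > f; close(f) }' input.txt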
I have a file with a simple list of ids, 750,000 rows. I have to break it down into multiple 50,000-row files to submit in a batch process. Is there an easy script I could write to accomplish this task? (2 Replies)
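No scripting needed for this one; split does it in a single command (ids.txt and the batch_ prefix are placeholder names):

Code:
split -l 50000 ids.txt batch_           # batch_aa, batch_ab, ... 50,000 lines each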
Hello
We have a text file with 400,000 lines and need to split it into multiple files of 5,000 lines each (which will result in 80 files).
I had the idea of using the head and tail commands in a loop, but that looked inefficient.
Please advise a simple yet effective way to do it.
TIA... (3 Replies)
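split is the right tool here too: a head/tail loop rereads the file from the top on every iteration, while split makes a single pass. The -d flag (numeric suffixes) is GNU-specific, so drop it on other systems; file.txt and the chunk_ prefix are placeholders:

Code:
split -l 5000 -d file.txt chunk_        # chunk_00 ... chunk_79, 5,000 lines each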
hi all
I'm new to this forum; excuse me if anything is wrong.
I have a file containing 600 MB of data. When I parse the data in a Perl program, I get an out-of-memory error.
So I am planning to split the file into smaller files and process them one by one.
Can anyone tell me what the code... (1 Reply)
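The root cause is probably that the Perl script slurps the whole 600 MB file into memory, and reading it line by line would avoid the split entirely; but the split-and-loop approach works too. A sketch with made-up names (one million lines per piece is an arbitrary choice, and process.pl stands in for the actual Perl script):

Code:
split -l 1000000 -d data.txt part_      # GNU split; numeric suffixes part_00, part_01, ...
for f in part_*; do
    perl process.pl "$f"                # process the pieces one at a time
done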
Hello,
I am in urgent need of help with the below.
I have a data file with 40,567 lines
and need to split it into multiple files with a smaller line count.
I am aware of the "split" command with the -l option, which allows you to specify the number of lines in the smaller files, with the target file-name pattern... (1 Reply)
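The question trails off at the naming pattern, but for the record: split takes the output prefix as its final argument, and GNU split produces numeric suffixes with -d (data.txt and the out_ prefix are placeholders):

Code:
split -l 5000 -d data.txt out_          # out_00, out_01, ... 5,000 lines each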
I have a very large (150 MB) IRC log file from 2000-2001 which I want to cut down into individual daily log files. I have a very basic knowledge of the cat, sed and grep commands. The log file is time-stamped, and each day in the large log file begins with a "Session Start" string like so:
... (11 Replies)
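The example line is cut off, but assuming each day begins with a line starting "Session Start", awk can open a new output file at every occurrence. The day_ numbering and irclog.txt are assumptions, since the real timestamps are not shown; parsing the date out of the "Session Start" line itself would give nicer file names:

Code:
awk '/^Session Start/ { if (out) close(out); out = sprintf("day_%04d.log", ++n) }
     out != ""        { print > out }' irclog.txt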