I just started writing my own small content scanner that searches all the visible files on my server, but now I'm stuck. It should scan the files for phrases like those in the following example.
What I tried is the following code:
That code first finds all the files within all public_html folders that are not larger than 307200k, and then scans the contents of those files.
That worked fine for the first few thousand files, but then it stopped working. I think there are too many files for grep to read all of them, or something else is wrong. There is no error message; the process just stays alive with CPU and memory usage at 0, forever.
So it would be great if someone had an idea of how to write the scanner so that it also works with a few hundred thousand files.
I just got the tip to use find with xargs and grep to solve the problem, but my combinations just won't work. Hopefully someone can help, because I have never tried anything like that before.
I need a pro to help me with this problem, because I am still a beginner with bash.
It's not clear whether you need just the filenames or the matching lines as well. I'm also assuming that you're using the pipe | as an alternation operator (in the regular expression), not as a literal character in the files' records. Modify if needed.
If you're using GNU find/xargs (most Linuxes), use find's -print0 together with xargs' -0 option to handle problematic filenames.
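To make the -print0 / -0 suggestion concrete, here is a minimal sketch, assuming GNU find, xargs and grep. The function name scan_public_html, the demo directory tree and the phrases phraseA and phraseB are all made up for illustration; the size limit and the public_html restriction are taken from the question, and -size -307200k is only my guess at how that limit was expressed.

```shell
#!/usr/bin/env bash
# Sketch only: scan_public_html, the demo tree and the phrases are hypothetical.

scan_public_html() {
    local root=$1 pattern=$2
    # -print0 / -0 pass file names NUL-separated, so names containing
    # spaces or newlines reach grep intact.
    # -r (GNU xargs) skips running grep when find yields nothing.
    # grep -l prints only the names of matching files; -E enables | alternation.
    find "$root" -type f -path '*/public_html/*' -size -307200k -print0 |
        xargs -0 -r grep -l -E -- "$pattern"
}

# Small throwaway tree to try it on.
demo=$(mktemp -d)
mkdir -p "$demo/user one/public_html" "$demo/user2/private"
printf 'hello phraseA\n' > "$demo/user one/public_html/a file.php"
printf 'nothing here\n'  > "$demo/user one/public_html/clean.php"
printf 'phraseB\n'       > "$demo/user2/private/skip.php"

matches=$(scan_public_html "$demo" 'phraseA|phraseB')
printf '%s\n' "$matches"

rm -rf "$demo"
```

Because xargs splits the file list into batches that fit the system's argument-length limit, the same pipeline should cope with a few hundred thousand files. Drop -l if you want the matching lines rather than just the file names.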
Thanks for the reply. I just tried your code but ran into some problems.
First I just tried:
There I got a lot of error messages from grep saying that the files or folders don't exist.
...
It works fine for me. Try testing the commands separately, first find:
(I guess you know that, as written, find will return both files and directories.)
then the grep command on some test files (plural),
then all together with xargs (I'm guessing you're on Linux; with GNU find/xargs the right syntax is a bit different, print has to be spelled out explicitly):
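The three test steps above could look roughly like this. It is only a sketch, assuming GNU find/xargs/grep; the temporary directory, the file names and the phrases foo and bar are invented for the test and are not the commands from the earlier posts.

```shell
#!/usr/bin/env bash
# Hypothetical test tree and phrases, for illustration only.
tmp=$(mktemp -d)
mkdir -p "$tmp/public_html"
printf 'contains foo\n' > "$tmp/public_html/one.txt"
printf 'contains bar\n' > "$tmp/public_html/two.txt"
printf 'clean\n'        > "$tmp/public_html/three.txt"

# Step 1: find on its own. -type f limits the output to regular files,
# so no directories are handed to grep later.
found=$(find "$tmp" -type f | sort)
printf 'find alone:\n%s\n' "$found"

# Step 2: grep on a couple of test files named by hand.
by_hand=$(grep -l -E 'foo|bar' "$tmp/public_html/one.txt" "$tmp/public_html/three.txt")
printf 'grep alone:\n%s\n' "$by_hand"

# Step 3: both together, with NUL-separated names (-print0 paired with -0).
piped=$(find "$tmp" -type f -print0 | xargs -0 grep -l -E 'foo|bar' | sort)
printf 'find | xargs grep:\n%s\n' "$piped"

rm -rf "$tmp"
```

If step 1 and step 2 behave but step 3 reports missing files, the usual suspect is file names with whitespace being split by a plain xargs, which is what the -print0 / -0 pairing avoids.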