Split a text file into what's before and after a given pattern
I'd like to split a text file into two files, such that the first output includes all the lines up to (but not including) a given pattern if the pattern is in the file, or the whole input file if the pattern is not there. The second file should contain all the lines after the pattern, or be an empty file.
file1.txt:
a
b
c
$ split.sh file1.txt "b"
file1.txt.before:
a
file1.txt.after:
c
$ split.sh file1.txt "d"
file1.txt.before:
a
b
c
file1.txt.after:

I tried different sed commands; the closest I came up with is:
sed "1,/$2/!d" < $1 > $1.before
sed "1,/$2/d" < $1 > $1.after
but this has some problems:
- the before file is missing the 1st line of the input file
- the before file contains the pattern
2 Answers
Use csplit for things like this.
CSPLIT(1)                    User Commands                    CSPLIT(1)

NAME
       csplit - split a file into sections determined by context lines

       -f, --prefix=PREFIX
              use PREFIX instead of 'xx'

       --suppress-matched
              suppress the lines matching PATTERN

Regarding the regex part of the command:
Each PATTERN may be:
       INTEGER            copy up to but not including specified line number
       /REGEXP/[OFFSET]   copy up to but not including a matching line
       %REGEXP%[OFFSET]   skip to, but not including a matching line
       {INTEGER}          repeat the previous pattern specified number of times
       {*}                repeat the previous pattern as many times as possible

A line OFFSET is a required '+' or '-' followed by a positive integer.

The command
csplit <file.txt> /<string>/ '{*}'
will split <file.txt> at every line matching <string>. The '{*}' repeats the search, creating a new file for each occurrence. By default the files are named xx{number}; use the --prefix option to change that. Adding --suppress-matched omits the matching lines from the output files.
Here's a way using Awk:
awk -v pattern='^b' '
  NR==1 {suff = ".before"}
  $0 ~ pattern {suff = ".after"; next}
  # parentheses keep the concatenated file name unambiguous across awk versions
  {print > (FILENAME suff)}
' file1.txt
Ex.
awk -v pattern='^b' 'NR==1 {suff = ".before"} $0 ~ pattern {suff = ".after"; next} {print > (FILENAME suff)}' file1.txt
giving
$ head file1.txt*
==> file1.txt <==
a
b
c
==> file1.txt.after <==
c
==> file1.txt.before <==
a
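The Awk command above skips every matching line, not just the first, and it does not create the .after file when the pattern is absent. A possible variant that covers both cases is sketched below. The function name split_at and the variable names are made up for illustration, and the pattern is assumed to be an awk extended regular expression (note that awk -v also interprets backslash escapes in the value).

```shell
#!/bin/sh
# split_at FILE PATTERN   (hypothetical helper, not from the original answer)
# Writes FILE.before with the lines before the first line matching PATTERN
# and FILE.after with the lines after it.
split_at() {
    file=$1
    pattern=$2
    # Create or truncate both outputs first, so .after exists even when empty.
    : > "$file.before"
    : > "$file.after"
    awk -v before="$file.before" -v after="$file.after" -v pattern="$pattern" '
        BEGIN { out = before }
        # Only the first matching line switches the output; it is not printed.
        out == before && $0 ~ pattern { out = after; next }
        { print > out }
    ' "$file"
}

# Demo in a scratch directory.
cd "$(mktemp -d)" || exit 1
printf 'a\nb\nc\n' > file1.txt
split_at file1.txt 'b'
cat file1.txt.before
cat file1.txt.after
```

With a pattern that never matches, such as 'd', everything goes to file1.txt.before and file1.txt.after is left as an empty file, which is the behaviour the question asks for.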