How can I cut email addresses with sed?
I have the following emails.txt with:
;es
@pepito.com
And a sed command to get
sed -n -r '/\w+@\w+\.\w+((\.\w+)*)?/p' emails.txt
But it keeps displaying emails with more than one suffix such as .com
I don't want these emails:
;es
@pepito.com
I'm stuck here and have no clue how to get it.
23 Answers
With sed, you could do:
$ sed -nr '/^[^@]+@[^.]+\.com\s*$/p' file
The regex looks for one or more non-@ characters at the beginning of the line, then an @, then one or more non-. characters followed by .com, and then zero or more whitespace characters up to the end of the line.
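As a rough illustration of how that pattern behaves, here is a minimal sketch run against made-up data. The addresses and the function name filter_single_com are hypothetical, and the pattern is spelled with -E and [[:space:]] instead of -r and \s on the assumption that your sed may not be GNU sed.

```shell
# Hypothetical sample data; these addresses are made up for illustration.
sample=$(printf '%s\n' \
  'alice@example.com' \
  'bob@example.com.es' \
  'carol@mail.example.com' \
  'dave@example.com  ' \
  'no-at-sign.com')

# Keep only lines of the form local@domain.com where nothing but
# optional whitespace follows ".com".
filter_single_com() {
  sed -n -E '/^[^@]+@[^.]+\.com[[:space:]]*$/p'
}

result=$(printf '%s\n' "$sample" | filter_single_com)
printf '%s\n' "$result"
```

Only the alice and dave lines should survive: bob has an extra .es suffix, carol has a dot before the .com part, and the last line has no @ at all.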
Other choices:
Perl
perl -ne 'print if /^[^@]+@[^.]+\.com\s*$/' file
GNU grep
grep -P '^[^@]+@[^.]+\.com\s*$' file
POSIX grep (with [[:space:]], since \s is a GNU extension)
grep -E '^[^@]+@[^.]+\.com[[:space:]]*$' file
awk
awk '$0~/^[^@]+@[^.]+\.com\s*$/' file
I would use something like this:
sed -n -r '/\w+@\w+\.com$/p' emails.txt
It will retrieve every line that ends with an email matching \w+@\w+\.com
In case you need something more "universal", matching not only .com but also .fr or .uk, you can use:
sed -n -r '/\w+@\w+\.\w+$/p' emails.txt
This will retrieve every line that ends with an email matching \w+@\w+\.\w+
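To compare the two variants side by side, here is a small sketch under stated assumptions: the sample addresses are invented, and the variable W is a hypothetical stand-in for \w so that the commands do not rely on GNU extensions.

```shell
# W is a portable stand-in for \w (an assumption, not part of the answer).
W='[[:alnum:]_]'

# Hypothetical sample data.
sample=$(printf '%s\n' 'alice@example.com' 'erin@example.fr' 'frank@example.co.uk')

# Only addresses ending in .com
com_only=$(printf '%s\n' "$sample" | sed -n -E "/$W+@$W+\.com\$/p")

# Addresses ending in any single suffix after the domain
any_tld=$(printf '%s\n' "$sample" | sed -n -E "/$W+@$W+\.$W+\$/p")

printf 'com only:\n%s\n' "$com_only"
printf 'any suffix:\n%s\n' "$any_tld"
```

Note that frank@example.co.uk is rejected by both patterns, because it has two suffixes after the domain. That is the behaviour asked for in the question, but it also means a two-part suffix like .co.uk will not be matched.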
The expression ((\.\w+)*)? matches additional sequences of the form .xyz after the first domain. Because that group is optional and the pattern is not anchored to the end of the line, addresses with extra suffixes match either way. If you want to match only those addresses with a single domain, then you can enforce that by replacing it with $ or (more robustly) \s*$:
sed -n -r '/\w+@\w+\.\w+\s*$/p' emails.txt
to require that there is nothing (except possibly whitespace) between the first domain and the end of the line.
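The difference between the original pattern and the anchored one can be checked with a short sketch. The two sample addresses are made up, and W is a hypothetical variable standing in for \w, with [[:space:]] standing in for \s, so the commands should not depend on GNU sed.

```shell
# W is a portable stand-in for \w (an assumption for this sketch).
W='[[:alnum:]_]'

# Hypothetical sample data.
sample=$(printf '%s\n' 'alice@example.com' 'bob@example.com.es')

# Original, unanchored pattern: the optional group may match ".es"
# or nothing at all, so both lines are printed.
loose=$(printf '%s\n' "$sample" | sed -n -E "/$W+@$W+\.$W+((\.$W+)*)?/p")

# Anchored pattern: only whitespace may follow the first ".xyz".
strict=$(printf '%s\n' "$sample" | sed -n -E "/$W+@$W+\.$W+[[:space:]]*\$/p")

printf 'loose:\n%s\n' "$loose"
printf 'strict:\n%s\n' "$strict"
```

With this input, the loose pattern prints both addresses, while the strict one prints only alice@example.com.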
More in general