M BUZZ CRAZE NEWS

How can I cut email addresses with sed?

By Emma Martinez

I have the following emails.txt:

;es
@pepito.com 

And a sed command to get the addresses:

sed -n -r '/\w+@\w+\.\w+((\.\w+)*)?/p' emails.txt 

But it keeps displaying emails with more than one suffix like .com.

I don't want these emails:

;es
@pepito.com 

I'm stuck here and have no idea how to filter them out.


3 Answers

With sed, you could do:

$ sed -nr '/^[^@]+@[^.]+\.com\s*$/p' file

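The regex matches one or more non-@ characters at the beginning of the line, then an @, then one or more non-. characters followed by .com, and finally zero or more whitespace characters up to the end of the line. Because [^.]+ cannot contain a dot, nothing like .com.xyz or sub.domain.com can match.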
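
To see the effect, here is a small sketch on made-up data. The addresses are hypothetical (they are not the ones from the question), and the function name single_com is only for illustration. It uses -E and [[:space:]], which should behave like -r and \s here but do not depend on GNU extensions.

```shell
# Keep only lines of the form local@label.com, with optional trailing whitespace.
# Reads from standard input.
single_com() {
  sed -n -E '/^[^@]+@[^.]+\.com[[:space:]]*$/p'
}

# Hypothetical sample addresses, one per line.
printf '%s\n' \
  'alice@example.com' \
  'bob@example.com.es' \
  'carol@mail.example.com' \
  'dave@example.org' | single_com
# prints: alice@example.com
```

Only the first line survives: the second has an extra suffix after .com, the third has a dot before the final label, and the fourth does not end in .com.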


Other choices:

  • Perl

    perl -ne 'print if /^[^@]+@[^.]+\.com\s*$/' file
  • GNU grep

    grep -P '^[^@]+@[^.]+\.com\s*$' file
  • POSIX grep (\s is a GNU extension, so use the [[:space:]] class)

    grep -E '^[^@]+@[^.]+\.com[[:space:]]*$' file
  • awk

    awk '$0~/^[^@]+@[^.]+\.com[[:space:]]*$/' file

I would use something like this:

sed -n -r '/\w+@\w+\.com$/p' emails.txt

It will retrieve every line that ends with word characters, an @, more word characters, and then .com.

In case you need something more "universal" that accepts not only .com but also .fr or .uk, you can use:

sed -n -r '/\w+@\w+\.\w+$/p' emails.txt

This will retrieve every line that ends with word characters, an @, more word characters, and exactly one dot-suffix.
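
A quick sketch of how the generic pattern behaves, again on hypothetical addresses. The function name any_single_suffix is made up, and [[:alnum:]_] is used as the POSIX spelling of \w so the example does not rely on GNU sed.

```shell
# Print lines that end in name@label.suffix with exactly one dot after the @.
# Reads from standard input.
any_single_suffix() {
  sed -n -E '/[[:alnum:]_]+@[[:alnum:]_]+\.[[:alnum:]_]+$/p'
}

# Hypothetical sample addresses, one per line.
printf '%s\n' \
  'alice@example.com' \
  'bruno@example.fr' \
  'chloe@example.co.uk' \
  'bob@example.com.es' | any_single_suffix
# prints alice@example.com and bruno@example.fr
```

Note that an address such as chloe@example.co.uk is rejected too, since it has two suffixes after the domain label.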


The expression ((\.\w+)*)? matches additional sequences of the form .xyz after the first domain, and because it is optional and the pattern is not anchored, the original regex also matches longer addresses. If you want to match only those addresses with a single domain, you can enforce that by replacing it with $ or (more robustly) \s*$:

sed -n -r '/\w+@\w+\.\w+\s*$/p' emails.txt

to require that there is nothing (except possibly whitespace) between the first domain and the end of the line.
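
The difference between a bare $ and the whitespace-tolerant ending can be sketched as follows. The input lines and the function names strict and tolerant are hypothetical; the POSIX classes [[:alnum:]_] and [[:space:]] stand in for \w and \s.

```shell
# Hypothetical input: the second line carries a trailing space.
input=$(printf '%s\n' 'alice@example.com' 'bob@example.org ' 'carl@example.com.es')

# Ends the match at a bare end of line.
strict()   { sed -n -E '/[[:alnum:]_]+@[[:alnum:]_]+\.[[:alnum:]_]+$/p'; }
# Allows trailing whitespace before the end of the line.
tolerant() { sed -n -E '/[[:alnum:]_]+@[[:alnum:]_]+\.[[:alnum:]_]+[[:space:]]*$/p'; }

printf '%s\n' "$input" | strict   | wc -l   # counts 1 line (alice only)
printf '%s\n' "$input" | tolerant | wc -l   # counts 2 lines (alice and bob)
```

Both variants reject the .com.es address; only the tolerant one keeps the line that ends in a space.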
