Extract specific characters from each line

I have a text file, and I want to extract the string that comes after "OS=" on each line.

Input file lines:
A0A0A9PBI3_ARUDO Uncharacterized protein OS=Arundo donax OX=35708 PE=4 SV=1
K3Y356_SETIT ATP-dependent DNA helicase OS=Setaria italica OX=4555 PE=3 SV=1

Desired output:

OS=Arundo donax
OS=Setaria italica

or, without the "OS=" prefix:

Arundo donax
Setaria italica

asked Sep 5 at 14:04 by shahzad
edited Sep 6 at 9:54 by Jeff Schaller♦

Are there always 2 words to print after OS=, or do you want all words between OS= and OX=?
– oliv, Sep 5 at 14:13

I need only two words.
– shahzad, Sep 5 at 14:17

This is a work order, not a question. No demonstrated effort.
– Peter Mortensen, Sep 6 at 8:25

Tags: text-processing, awk, perl

4 Answers

Use GNU grep (or compatible) with an extended regex:

grep -Eo "OS=\w+ \w+" file

or with a basic regex, where you need to escape the + (or use * instead):

grep -o "OS=\w\+ \w\+" file
# or
grep -o "OS=\w* \w*" file

To get everything from OS= up to OX=, you can use grep with a Perl-compatible regex (PCRE, the -P option), if available, and a lookahead. The space inside the lookahead keeps the trailing blank out of the match:

grep -Po "OS=.*(?= OX=)" file

# to also leave out "OS=", use a lookbehind
grep -Po "(?<=OS=).*(?= OX=)" file
# or \K, which drops everything matched so far from the output
grep -Po "OS=\K.*(?= OX=)" file

or let grep include OX= and remove it with sed afterwards:

grep -o "OS=.*\( OX=\)" file | sed 's/ OX=$//'

Output:

OS=Arundo donax
OS=Setaria italica
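
To see how the two-word pattern and the up-to-OX= pattern differ, here is a small sketch run against a made-up line whose OS= value has three words. The sample line and the function names are assumptions for illustration, and `[[:alnum:]_]` stands in for `\w` so the pattern does not rely on that extension.

```shell
# Hypothetical input line: the OS= value has three words, not two.
line='Q0XXX1_TEST Example protein OS=Foo bar baz OX=12345 PE=4 SV=1'

# Exactly two words after OS= (POSIX bracket classes instead of \w).
two_words() {
    grep -Eo 'OS=[[:alnum:]_]+ [[:alnum:]_]+'
}

# Everything from OS= up to the OX= key; sed drops the trailing " OX=".
up_to_ox() {
    grep -o 'OS=.* OX=' | sed 's/ OX=$//'
}

printf '%s\n' "$line" | two_words   # prints: OS=Foo bar
printf '%s\n' "$line" | up_to_ox    # prints: OS=Foo bar baz
```

If the value can have more or fewer than two words, the up-to-OX= form is probably the safer choice of the two.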

answered Sep 5 at 14:20 by pLumo (edited Sep 6 at 6:25)

In Perl, two non-whitespace "words":

$ perl -lne 'print $1 if /OS=(\S+ \S+)/' input

or everything up to OX=:

$ perl -lne 'print $1 if /OS=(.*?) OX=/' input

or everything up to the next something=:

$ perl -lne 'print $1 if /OS=(.*?) (\w+)=/' input

With your sample input, they all give the same output, but the output would differ with, e.g., an input like this:

ABC=something here OS=foo bar doo PE=3 OX=1234

answered Sep 5 at 14:25 by ilkkachu

A more robust way is to use sed to capture the full value up to the word containing the next =. That way it works for a value of any length (e.g. a value with one word or three words).

sed 's/.*OS=\([^=]*\).*/\1/;s/ [^ ]*$//' file

The first part of the pattern, .*OS=, consumes everything up to and including OS=. The capture group, delimited by \( and \), then matches up to the next = and can be referred to in the replacement as \1. The second substitution removes the last word, which is the key fragment of the next assignment.

Note: a ^ at the start of a bracket expression negates it, so [^=] matches any character that is not an = sign.
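
A quick way to check the claim about values of different lengths is to run the same sed program over a few made-up lines. The lines and the function name `os_value` are assumptions for illustration only.

```shell
# Same sed program as above, wrapped in a function that reads stdin.
os_value() {
    sed 's/.*OS=\([^=]*\).*/\1/;s/ [^ ]*$//'
}

# Made-up lines with one-, two- and three-word values.
printf '%s\n' 'ID1 desc OS=Foo OX=1 PE=4' | os_value           # prints: Foo
printf '%s\n' 'ID2 desc OS=Foo bar OX=2 PE=4' | os_value       # prints: Foo bar
printf '%s\n' 'ID3 desc OS=Foo bar baz OX=3 PE=4' | os_value   # prints: Foo bar baz
```

One caveat: the second substitution always removes the last word, so a line where OS= is the final assignment (nothing with = after it) would lose the last word of its value.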

answered Sep 5 at 14:54 by A.Danischewski (edited Sep 5 at 15:03)

awk '{print $(NF-4), $(NF-3)}' file

OS=Arundo donax
OS=Setaria italica

awk -F= '{sub(/OX/,""); print $(NF-3)}' file

Arundo donax
Setaria italica
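
Both commands count fields from the end of the line, so they depend on exactly three assignments following the OS= value. As a hedged alternative, the sketch below searches for the field that starts with OS= instead. The sample line with an extra GN= field and the function name `os_two_words` are assumptions for illustration.

```shell
# Made-up line with an extra GN= field, which shifts positions counted from the end.
line='K3Y356_SETIT ATP-dependent DNA helicase OS=Setaria italica OX=4555 GN=abc PE=3 SV=1'

# Fixed offsets from the end now pick the wrong fields.
printf '%s\n' "$line" | awk '{print $(NF-4), $(NF-3)}'   # prints: italica OX=4555

# Locate the field starting with "OS=" and print it with the next field
# (still assumes a two-word value, as the asker stated).
os_two_words() {
    awk '{ for (i = 1; i < NF; i++) if ($i ~ /^OS=/) { print $i, $(i + 1); break } }'
}

printf '%s\n' "$line" | os_two_words   # prints: OS=Setaria italica
```

The loop stops at `i < NF` so that `$(i + 1)` always refers to an existing field.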

answered Sep 6 at 22:58 by Claes Wikner (edited Sep 7 at 17:43)
