Merge two files depending on multiple matching columnsMerging/joining a lot of csv files with numeric digits in the file namebash - merging 2 files using 2 common columns and add up the values of the 3rd columnawk - compare files and print lines from both filesawk - compare 2 files and print columns from both filesIf two columns partially match, replace third with awkHow to compare and replace the value at particular location with awkget specific lines from File1, others from File2 and print them in File3Awk-compare 2 files using multiple columns and print lines from both filescompare two lines and print unmatched words from two files

how do you value what your leisure time is worth?

What do you call the fallacy of thinking that some action A will guarantee some outcome B, when in reality B depends on multiple other conditions?

Prefill webform with civicrm activity data

Should I reveal productivity tricks to peers, or keep them to myself in order to be more productive than the others?

Does UPDATE without WHERE clause lock a table in PostgreSQL?

How long could a human survive completely without the immune system?

Why does allocating a single 2D array take longer than a loop allocating multiple 1D arrays of the same total size and shape?

Can massive damage kill you while at 0 HP?

How do lasers measure short distances (<1cm) when electronics are too slow for time-of-flight to work?

Does the Creighton Method of Natural Family Planning have a failure rate of 3.2% or less?

Would it be easier to colonise a living world or a dead world?

What is the next number in the series: 21, 21, 23, 20, 5, 25, 31, 24,?

What is /dev/null and why can't I use hx on it?

Transiting through Switzerland by coach with lots of cash

A sentient carnivorous species trying to preserve life. How could they find a new food source?

D&D Monsters and Copyright

What Supreme Court cases, other than Nixon v. United States, have directly applied or interpreted U.S. Const. Art. I, Section 3, Clause 6?

Coffee Grounds and Gritty Butter Cream Icing

Landing Hero: Product snippets VS illustrations

What if a quote contains an error

Why can I ping 10.0.0.0/8 addresses from a 192.168.1.0/24 subnet?

3x3 self-descriptive squares

Was post-exposure perspective correction ever practiced in the darkroom?

How to Skip One Bullet in ITEMIZE?



Merge two files depending on multiple matching columns


Merging/joining a lot of csv files with numeric digits in the file namebash - merging 2 files using 2 common columns and add up the values of the 3rd columnawk - compare files and print lines from both filesawk - compare 2 files and print columns from both filesIf two columns partially match, replace third with awkHow to compare and replace the value at particular location with awkget specific lines from File1, others from File2 and print them in File3Awk-compare 2 files using multiple columns and print lines from both filescompare two lines and print unmatched words from two files






.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty,.everyoneloves__bot-mid-leaderboard:empty
margin-bottom:0;









2















Im trying to join two files depending on multiple matching columns. I want to compare columns 1,2,4,5 from file 1 with columns 1,2,4,5 from file 2 and then merge matching lines in file 3 with column 3 of file 1 and all columns from files 2.
I've already tried several awk command. For example : 1) awk 'BEGINFS=OFS=","NR==FNRa[$1$2$4$5]=$3;next $1$2$4$5 in aprint $0, a[$1$2$4$5]' file2 file1 > file3 2) awk 'NR==FNR a[$1$2$4$5] = $3; next $1$2$4$5 in a' file2 file1 >file3
but i'm getting empty output. Could anyone help me with this issue ? Thank you



An example of what my files look like



File 1 :



CHROM,POS,ID,REF,ALT
1,69270,rs201219564,A,G
1,69428,rs140739101,T,G
1,69496,rs150690004,G,A
1,69511,rs75062661,A,G


File 2 :



Chr,Start,End,Ref,Alt,SIFT_score,SIFT_converted_rankscore,SIFT_pred,Polyphen2_HDIV_score,Polyphen2_HDIV_rankscore,Polyphen2_HDIV_pred,Polyphen2_HVAR_score,Polyphen2_HVAR_rankscore,Polyphen2_HVAR_pred,PROVEAN_score,PROVEAN_converted_rankscore,PROVEAN_pred,CADD_raw,CADD_raw_rankscore,CADD_phred
1,69511,69511,A,C,0.031,0.450,D,0.002,0.090,B,0.001,0.040,B,-1.52,0.370,N,2.8350.402,21.5
1,69511,69511,A,G,0.652,0.048,T,0.0,0.026,B,0.0,0.013,B,1.54,0.007,N,-0.784,0.041,0.047
1,69511,69511,A,T,1.0,0.010,T,0.0,0.026,B,0.0,0.013,B,2.06,0.004,N,-1.639,0.014,0.002
1,69496,69496,G,A,0.404,0.103,T,0.401,0.331,B,0.061,0.254,B,-1.39,0.344,N,0.9700.187,10.49
1,69496,69496,G,C,0.348,0.124,T,0.988,0.604,D,0.865,0.597,P,-2.54,0.550,D,2.6520.380,20.5
1,69496,69496,G,T,0.177,0.222,T,0.999,0.764,D,0.96,0.686,D,-2.69,0.574,D,1.4030.232,12.80
1,69428,69428,T,A,0.878,0.026,T,0.981,0.577,D,0.899,0.620,P,-0.16,0.095,N,0.0240.096,2.821
1,69428,69428,T,C,0.001,0.784,D,0.995,0.657,D,0.969,0.703,D,-5.34,0.846,D,4.5330.607,24.3
1,69428,69428,T,G,0.013,0.538,D,0.999,0.764,D,0.984,0.745,D,-5.05,0.827,D,4.1000.550,23.7


Wanted Output :



ID,Chr,Start,End,Ref,Alt,SIFT_score,SIFT_converted_rankscore,SIFT_pred,Polyphen2_HDIV_score,Polyphen2_HDIV_rankscore,Polyphen2_HDIV_pred,Polyphen2_HVAR_score,Polyphen2_HVAR_rankscore,Polyphen2_HVAR_pred,PROVEAN_score,PROVEAN_converted_rankscore,PROVEAN_pred,CADD_raw,CADD_raw_rankscore,CADD_phred
rs140739101,1,69428,69428,T,G,0.013,0.538,D,0.999,0.764,D,0.984,0.745,D,-5.05,0.827,D,4.1000.550,23.7
rs150690004,1,69496,69496,G,A,0.404,0.103,T,0.401,0.331,B,0.061,0.254,B,-1.39,0.344,N,0.9700.187,10.49
rs75062661,1,69511,69511,A,G,0.652,0.048,T,0.0,0.026,B,0.0,0.013,B,1.54,0.007,N,-0.784,0.041,0.047









share|improve this question
































    2















    Im trying to join two files depending on multiple matching columns. I want to compare columns 1,2,4,5 from file 1 with columns 1,2,4,5 from file 2 and then merge matching lines in file 3 with column 3 of file 1 and all columns from files 2.
    I've already tried several awk command. For example : 1) awk 'BEGINFS=OFS=","NR==FNRa[$1$2$4$5]=$3;next $1$2$4$5 in aprint $0, a[$1$2$4$5]' file2 file1 > file3 2) awk 'NR==FNR a[$1$2$4$5] = $3; next $1$2$4$5 in a' file2 file1 >file3
    but i'm getting empty output. Could anyone help me with this issue ? Thank you



    An example of what my files look like



    File 1 :



    CHROM,POS,ID,REF,ALT
    1,69270,rs201219564,A,G
    1,69428,rs140739101,T,G
    1,69496,rs150690004,G,A
    1,69511,rs75062661,A,G


    File 2 :



    Chr,Start,End,Ref,Alt,SIFT_score,SIFT_converted_rankscore,SIFT_pred,Polyphen2_HDIV_score,Polyphen2_HDIV_rankscore,Polyphen2_HDIV_pred,Polyphen2_HVAR_score,Polyphen2_HVAR_rankscore,Polyphen2_HVAR_pred,PROVEAN_score,PROVEAN_converted_rankscore,PROVEAN_pred,CADD_raw,CADD_raw_rankscore,CADD_phred
    1,69511,69511,A,C,0.031,0.450,D,0.002,0.090,B,0.001,0.040,B,-1.52,0.370,N,2.8350.402,21.5
    1,69511,69511,A,G,0.652,0.048,T,0.0,0.026,B,0.0,0.013,B,1.54,0.007,N,-0.784,0.041,0.047
    1,69511,69511,A,T,1.0,0.010,T,0.0,0.026,B,0.0,0.013,B,2.06,0.004,N,-1.639,0.014,0.002
    1,69496,69496,G,A,0.404,0.103,T,0.401,0.331,B,0.061,0.254,B,-1.39,0.344,N,0.9700.187,10.49
    1,69496,69496,G,C,0.348,0.124,T,0.988,0.604,D,0.865,0.597,P,-2.54,0.550,D,2.6520.380,20.5
    1,69496,69496,G,T,0.177,0.222,T,0.999,0.764,D,0.96,0.686,D,-2.69,0.574,D,1.4030.232,12.80
    1,69428,69428,T,A,0.878,0.026,T,0.981,0.577,D,0.899,0.620,P,-0.16,0.095,N,0.0240.096,2.821
    1,69428,69428,T,C,0.001,0.784,D,0.995,0.657,D,0.969,0.703,D,-5.34,0.846,D,4.5330.607,24.3
    1,69428,69428,T,G,0.013,0.538,D,0.999,0.764,D,0.984,0.745,D,-5.05,0.827,D,4.1000.550,23.7


    Wanted Output :



    ID,Chr,Start,End,Ref,Alt,SIFT_score,SIFT_converted_rankscore,SIFT_pred,Polyphen2_HDIV_score,Polyphen2_HDIV_rankscore,Polyphen2_HDIV_pred,Polyphen2_HVAR_score,Polyphen2_HVAR_rankscore,Polyphen2_HVAR_pred,PROVEAN_score,PROVEAN_converted_rankscore,PROVEAN_pred,CADD_raw,CADD_raw_rankscore,CADD_phred
    rs140739101,1,69428,69428,T,G,0.013,0.538,D,0.999,0.764,D,0.984,0.745,D,-5.05,0.827,D,4.1000.550,23.7
    rs150690004,1,69496,69496,G,A,0.404,0.103,T,0.401,0.331,B,0.061,0.254,B,-1.39,0.344,N,0.9700.187,10.49
    rs75062661,1,69511,69511,A,G,0.652,0.048,T,0.0,0.026,B,0.0,0.013,B,1.54,0.007,N,-0.784,0.041,0.047









    share|improve this question




























      2












      2








      2


      0






      Im trying to join two files depending on multiple matching columns. I want to compare columns 1,2,4,5 from file 1 with columns 1,2,4,5 from file 2 and then merge matching lines in file 3 with column 3 of file 1 and all columns from files 2.
      I've already tried several awk command. For example : 1) awk 'BEGINFS=OFS=","NR==FNRa[$1$2$4$5]=$3;next $1$2$4$5 in aprint $0, a[$1$2$4$5]' file2 file1 > file3 2) awk 'NR==FNR a[$1$2$4$5] = $3; next $1$2$4$5 in a' file2 file1 >file3
      but i'm getting empty output. Could anyone help me with this issue ? Thank you



      An example of what my files look like



      File 1 :



      CHROM,POS,ID,REF,ALT
      1,69270,rs201219564,A,G
      1,69428,rs140739101,T,G
      1,69496,rs150690004,G,A
      1,69511,rs75062661,A,G


      File 2 :



      Chr,Start,End,Ref,Alt,SIFT_score,SIFT_converted_rankscore,SIFT_pred,Polyphen2_HDIV_score,Polyphen2_HDIV_rankscore,Polyphen2_HDIV_pred,Polyphen2_HVAR_score,Polyphen2_HVAR_rankscore,Polyphen2_HVAR_pred,PROVEAN_score,PROVEAN_converted_rankscore,PROVEAN_pred,CADD_raw,CADD_raw_rankscore,CADD_phred
      1,69511,69511,A,C,0.031,0.450,D,0.002,0.090,B,0.001,0.040,B,-1.52,0.370,N,2.8350.402,21.5
      1,69511,69511,A,G,0.652,0.048,T,0.0,0.026,B,0.0,0.013,B,1.54,0.007,N,-0.784,0.041,0.047
      1,69511,69511,A,T,1.0,0.010,T,0.0,0.026,B,0.0,0.013,B,2.06,0.004,N,-1.639,0.014,0.002
      1,69496,69496,G,A,0.404,0.103,T,0.401,0.331,B,0.061,0.254,B,-1.39,0.344,N,0.9700.187,10.49
      1,69496,69496,G,C,0.348,0.124,T,0.988,0.604,D,0.865,0.597,P,-2.54,0.550,D,2.6520.380,20.5
      1,69496,69496,G,T,0.177,0.222,T,0.999,0.764,D,0.96,0.686,D,-2.69,0.574,D,1.4030.232,12.80
      1,69428,69428,T,A,0.878,0.026,T,0.981,0.577,D,0.899,0.620,P,-0.16,0.095,N,0.0240.096,2.821
      1,69428,69428,T,C,0.001,0.784,D,0.995,0.657,D,0.969,0.703,D,-5.34,0.846,D,4.5330.607,24.3
      1,69428,69428,T,G,0.013,0.538,D,0.999,0.764,D,0.984,0.745,D,-5.05,0.827,D,4.1000.550,23.7


      Wanted Output :



      ID,Chr,Start,End,Ref,Alt,SIFT_score,SIFT_converted_rankscore,SIFT_pred,Polyphen2_HDIV_score,Polyphen2_HDIV_rankscore,Polyphen2_HDIV_pred,Polyphen2_HVAR_score,Polyphen2_HVAR_rankscore,Polyphen2_HVAR_pred,PROVEAN_score,PROVEAN_converted_rankscore,PROVEAN_pred,CADD_raw,CADD_raw_rankscore,CADD_phred
      rs140739101,1,69428,69428,T,G,0.013,0.538,D,0.999,0.764,D,0.984,0.745,D,-5.05,0.827,D,4.1000.550,23.7
      rs150690004,1,69496,69496,G,A,0.404,0.103,T,0.401,0.331,B,0.061,0.254,B,-1.39,0.344,N,0.9700.187,10.49
      rs75062661,1,69511,69511,A,G,0.652,0.048,T,0.0,0.026,B,0.0,0.013,B,1.54,0.007,N,-0.784,0.041,0.047









      share|improve this question
















      Im trying to join two files depending on multiple matching columns. I want to compare columns 1,2,4,5 from file 1 with columns 1,2,4,5 from file 2 and then merge matching lines in file 3 with column 3 of file 1 and all columns from files 2.
      I've already tried several awk command. For example : 1) awk 'BEGINFS=OFS=","NR==FNRa[$1$2$4$5]=$3;next $1$2$4$5 in aprint $0, a[$1$2$4$5]' file2 file1 > file3 2) awk 'NR==FNR a[$1$2$4$5] = $3; next $1$2$4$5 in a' file2 file1 >file3
      but i'm getting empty output. Could anyone help me with this issue ? Thank you



      An example of what my files look like



      File 1 :



      CHROM,POS,ID,REF,ALT
      1,69270,rs201219564,A,G
      1,69428,rs140739101,T,G
      1,69496,rs150690004,G,A
      1,69511,rs75062661,A,G


      File 2 :



      Chr,Start,End,Ref,Alt,SIFT_score,SIFT_converted_rankscore,SIFT_pred,Polyphen2_HDIV_score,Polyphen2_HDIV_rankscore,Polyphen2_HDIV_pred,Polyphen2_HVAR_score,Polyphen2_HVAR_rankscore,Polyphen2_HVAR_pred,PROVEAN_score,PROVEAN_converted_rankscore,PROVEAN_pred,CADD_raw,CADD_raw_rankscore,CADD_phred
      1,69511,69511,A,C,0.031,0.450,D,0.002,0.090,B,0.001,0.040,B,-1.52,0.370,N,2.8350.402,21.5
      1,69511,69511,A,G,0.652,0.048,T,0.0,0.026,B,0.0,0.013,B,1.54,0.007,N,-0.784,0.041,0.047
      1,69511,69511,A,T,1.0,0.010,T,0.0,0.026,B,0.0,0.013,B,2.06,0.004,N,-1.639,0.014,0.002
      1,69496,69496,G,A,0.404,0.103,T,0.401,0.331,B,0.061,0.254,B,-1.39,0.344,N,0.9700.187,10.49
      1,69496,69496,G,C,0.348,0.124,T,0.988,0.604,D,0.865,0.597,P,-2.54,0.550,D,2.6520.380,20.5
      1,69496,69496,G,T,0.177,0.222,T,0.999,0.764,D,0.96,0.686,D,-2.69,0.574,D,1.4030.232,12.80
      1,69428,69428,T,A,0.878,0.026,T,0.981,0.577,D,0.899,0.620,P,-0.16,0.095,N,0.0240.096,2.821
      1,69428,69428,T,C,0.001,0.784,D,0.995,0.657,D,0.969,0.703,D,-5.34,0.846,D,4.5330.607,24.3
      1,69428,69428,T,G,0.013,0.538,D,0.999,0.764,D,0.984,0.745,D,-5.05,0.827,D,4.1000.550,23.7


      Wanted Output :



      ID,Chr,Start,End,Ref,Alt,SIFT_score,SIFT_converted_rankscore,SIFT_pred,Polyphen2_HDIV_score,Polyphen2_HDIV_rankscore,Polyphen2_HDIV_pred,Polyphen2_HVAR_score,Polyphen2_HVAR_rankscore,Polyphen2_HVAR_pred,PROVEAN_score,PROVEAN_converted_rankscore,PROVEAN_pred,CADD_raw,CADD_raw_rankscore,CADD_phred
      rs140739101,1,69428,69428,T,G,0.013,0.538,D,0.999,0.764,D,0.984,0.745,D,-5.05,0.827,D,4.1000.550,23.7
      rs150690004,1,69496,69496,G,A,0.404,0.103,T,0.401,0.331,B,0.061,0.254,B,-1.39,0.344,N,0.9700.187,10.49
      rs75062661,1,69511,69511,A,G,0.652,0.048,T,0.0,0.026,B,0.0,0.013,B,1.54,0.007,N,-0.784,0.041,0.047






      awk annotation






      share|improve this question















      share|improve this question













      share|improve this question




      share|improve this question








      edited Apr 17 at 14:20









      user68186

      18.2k8 gold badges53 silver badges77 bronze badges




      18.2k8 gold badges53 silver badges77 bronze badges










      asked Apr 17 at 13:43









      Rokhaya BARokhaya BA

      111 bronze badge




      111 bronze badge























          1 Answer
          1






          active

          oldest

          votes


















          2
















          The most obvious thing you're missing is that your files are comma separated, but you use the default (whitespace) field separator.



          What you seem to want is



          awk -F, '
          BEGINOFS=FS
          NR==FNRa[$1 FS $2 FS $4 FS $5] = $0; next
          ind = $1 FS $2 FS $4 FS $5
          ind in a print $3, a[ind]
          ' file2 file1 > file3


          (separating the fields with FS i the associative array key string just guards against false matches; if you just concatenate fields you can't distinguish between "abcdef" and "abc""def").



          Ex.



          $ awk -F, 'BEGINOFS=FS NR==FNRa[$1 FS $2 FS $4 FS $5] = $0; next ind = $1 FS $2 FS $4 FS $5 ind in a print $3, a[ind]' file2 file1
          rs140739101,1,69428,69428,T,G,0.013,0.538,D,0.999,0.764,D,0.984,0.745,D,-5.05,0.827,D,4.1000.550,23.7
          rs150690004,1,69496,69496,G,A,0.404,0.103,T,0.401,0.331,B,0.061,0.254,B,-1.39,0.344,N,0.9700.187,10.49
          rs75062661,1,69511,69511,A,G,0.652,0.048,T,0.0,0.026,B,0.0,0.013,B,1.54,0.007,N,-0.784,0.041,0.047





          share|improve this answer

























          • Thank you for your answer. Actually i did try to specify the separator but i get the same result. I also tried to delete end lines and then sorted files. But it doesnt change anything. I still get empty output.

            – Rokhaya BA
            Apr 17 at 14:40












          • @RokhayaBA do your files have DOS-style (CRLF) line endings by any chance? if so, either convert them to Unix style (with dos2unix for example) or change the awk record separator accordingly e.g. BEGINRS="rn"; OFS=FS

            – steeldriver
            Apr 17 at 15:47











          • Thank you very much. That was the problem. I was trying to delete line endings for each files first (tr 'r' 'n' < file1 > file1new) before applying awk command. But changing the awk record directly was definitely the solution.

            – Rokhaya BA
            Apr 18 at 8:27












          Your Answer








          StackExchange.ready(function()
          var channelOptions =
          tags: "".split(" "),
          id: "89"
          ;
          initTagRenderer("".split(" "), "".split(" "), channelOptions);

          StackExchange.using("externalEditor", function()
          // Have to fire editor after snippets, if snippets enabled
          if (StackExchange.settings.snippets.snippetsEnabled)
          StackExchange.using("snippets", function()
          createEditor();
          );

          else
          createEditor();

          );

          function createEditor()
          StackExchange.prepareEditor(
          heartbeatType: 'answer',
          autoActivateHeartbeat: false,
          convertImagesToLinks: true,
          noModals: true,
          showLowRepImageUploadWarning: true,
          reputationToPostImages: 10,
          bindNavPrevention: true,
          postfix: "",
          imageUploader:
          brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
          contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/4.0/"u003ecc by-sa 4.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
          allowUrls: true
          ,
          onDemand: true,
          discardSelector: ".discard-answer"
          ,immediatelyShowMarkdownHelp:true
          );



          );














          draft saved

          draft discarded
















          StackExchange.ready(
          function ()
          StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2faskubuntu.com%2fquestions%2f1134662%2fmerge-two-files-depending-on-multiple-matching-columns%23new-answer', 'question_page');

          );

          Post as a guest















          Required, but never shown

























          1 Answer
          1






          active

          oldest

          votes








          1 Answer
          1






          active

          oldest

          votes









          active

          oldest

          votes






          active

          oldest

          votes









          2
















          The most obvious thing you're missing is that your files are comma separated, but you use the default (whitespace) field separator.



          What you seem to want is



          awk -F, '
          BEGINOFS=FS
          NR==FNRa[$1 FS $2 FS $4 FS $5] = $0; next
          ind = $1 FS $2 FS $4 FS $5
          ind in a print $3, a[ind]
          ' file2 file1 > file3


          (separating the fields with FS i the associative array key string just guards against false matches; if you just concatenate fields you can't distinguish between "abcdef" and "abc""def").



          Ex.



          $ awk -F, 'BEGINOFS=FS NR==FNRa[$1 FS $2 FS $4 FS $5] = $0; next ind = $1 FS $2 FS $4 FS $5 ind in a print $3, a[ind]' file2 file1
          rs140739101,1,69428,69428,T,G,0.013,0.538,D,0.999,0.764,D,0.984,0.745,D,-5.05,0.827,D,4.1000.550,23.7
          rs150690004,1,69496,69496,G,A,0.404,0.103,T,0.401,0.331,B,0.061,0.254,B,-1.39,0.344,N,0.9700.187,10.49
          rs75062661,1,69511,69511,A,G,0.652,0.048,T,0.0,0.026,B,0.0,0.013,B,1.54,0.007,N,-0.784,0.041,0.047





          share|improve this answer

























          • Thank you for your answer. Actually i did try to specify the separator but i get the same result. I also tried to delete end lines and then sorted files. But it doesnt change anything. I still get empty output.

            – Rokhaya BA
            Apr 17 at 14:40












          • @RokhayaBA do your files have DOS-style (CRLF) line endings by any chance? if so, either convert them to Unix style (with dos2unix for example) or change the awk record separator accordingly e.g. BEGINRS="rn"; OFS=FS

            – steeldriver
            Apr 17 at 15:47











          • Thank you very much. That was the problem. I was trying to delete line endings for each files first (tr 'r' 'n' < file1 > file1new) before applying awk command. But changing the awk record directly was definitely the solution.

            – Rokhaya BA
            Apr 18 at 8:27















          2
















          The most obvious thing you're missing is that your files are comma separated, but you use the default (whitespace) field separator.



          What you seem to want is



          awk -F, '
          BEGINOFS=FS
          NR==FNRa[$1 FS $2 FS $4 FS $5] = $0; next
          ind = $1 FS $2 FS $4 FS $5
          ind in a print $3, a[ind]
          ' file2 file1 > file3


          (separating the fields with FS i the associative array key string just guards against false matches; if you just concatenate fields you can't distinguish between "abcdef" and "abc""def").



          Ex.



          $ awk -F, 'BEGINOFS=FS NR==FNRa[$1 FS $2 FS $4 FS $5] = $0; next ind = $1 FS $2 FS $4 FS $5 ind in a print $3, a[ind]' file2 file1
          rs140739101,1,69428,69428,T,G,0.013,0.538,D,0.999,0.764,D,0.984,0.745,D,-5.05,0.827,D,4.1000.550,23.7
          rs150690004,1,69496,69496,G,A,0.404,0.103,T,0.401,0.331,B,0.061,0.254,B,-1.39,0.344,N,0.9700.187,10.49
          rs75062661,1,69511,69511,A,G,0.652,0.048,T,0.0,0.026,B,0.0,0.013,B,1.54,0.007,N,-0.784,0.041,0.047





          share|improve this answer

























          • Thank you for your answer. Actually i did try to specify the separator but i get the same result. I also tried to delete end lines and then sorted files. But it doesnt change anything. I still get empty output.

            – Rokhaya BA
            Apr 17 at 14:40












          • @RokhayaBA do your files have DOS-style (CRLF) line endings by any chance? if so, either convert them to Unix style (with dos2unix for example) or change the awk record separator accordingly e.g. BEGINRS="rn"; OFS=FS

            – steeldriver
            Apr 17 at 15:47











          • Thank you very much. That was the problem. I was trying to delete line endings for each files first (tr 'r' 'n' < file1 > file1new) before applying awk command. But changing the awk record directly was definitely the solution.

            – Rokhaya BA
            Apr 18 at 8:27













          2














          2










          2









          The most obvious thing you're missing is that your files are comma separated, but you use the default (whitespace) field separator.



          What you seem to want is



          awk -F, '
          BEGINOFS=FS
          NR==FNRa[$1 FS $2 FS $4 FS $5] = $0; next
          ind = $1 FS $2 FS $4 FS $5
          ind in a print $3, a[ind]
          ' file2 file1 > file3


          (separating the fields with FS i the associative array key string just guards against false matches; if you just concatenate fields you can't distinguish between "abcdef" and "abc""def").



          Ex.



          $ awk -F, 'BEGINOFS=FS NR==FNRa[$1 FS $2 FS $4 FS $5] = $0; next ind = $1 FS $2 FS $4 FS $5 ind in a print $3, a[ind]' file2 file1
          rs140739101,1,69428,69428,T,G,0.013,0.538,D,0.999,0.764,D,0.984,0.745,D,-5.05,0.827,D,4.1000.550,23.7
          rs150690004,1,69496,69496,G,A,0.404,0.103,T,0.401,0.331,B,0.061,0.254,B,-1.39,0.344,N,0.9700.187,10.49
          rs75062661,1,69511,69511,A,G,0.652,0.048,T,0.0,0.026,B,0.0,0.013,B,1.54,0.007,N,-0.784,0.041,0.047





          share|improve this answer













          The most obvious thing you're missing is that your files are comma separated, but you use the default (whitespace) field separator.



          What you seem to want is



          awk -F, '
          BEGINOFS=FS
          NR==FNRa[$1 FS $2 FS $4 FS $5] = $0; next
          ind = $1 FS $2 FS $4 FS $5
          ind in a print $3, a[ind]
          ' file2 file1 > file3


          (separating the fields with FS i the associative array key string just guards against false matches; if you just concatenate fields you can't distinguish between "abcdef" and "abc""def").



          Ex.



          $ awk -F, 'BEGINOFS=FS NR==FNRa[$1 FS $2 FS $4 FS $5] = $0; next ind = $1 FS $2 FS $4 FS $5 ind in a print $3, a[ind]' file2 file1
          rs140739101,1,69428,69428,T,G,0.013,0.538,D,0.999,0.764,D,0.984,0.745,D,-5.05,0.827,D,4.1000.550,23.7
          rs150690004,1,69496,69496,G,A,0.404,0.103,T,0.401,0.331,B,0.061,0.254,B,-1.39,0.344,N,0.9700.187,10.49
          rs75062661,1,69511,69511,A,G,0.652,0.048,T,0.0,0.026,B,0.0,0.013,B,1.54,0.007,N,-0.784,0.041,0.047






          share|improve this answer












          share|improve this answer



          share|improve this answer










          answered Apr 17 at 14:13









          steeldriversteeldriver

          79.8k12 gold badges131 silver badges214 bronze badges




          79.8k12 gold badges131 silver badges214 bronze badges















          • Thank you for your answer. Actually i did try to specify the separator but i get the same result. I also tried to delete end lines and then sorted files. But it doesnt change anything. I still get empty output.

            – Rokhaya BA
            Apr 17 at 14:40












          • @RokhayaBA do your files have DOS-style (CRLF) line endings by any chance? if so, either convert them to Unix style (with dos2unix for example) or change the awk record separator accordingly e.g. BEGINRS="rn"; OFS=FS

            – steeldriver
            Apr 17 at 15:47











          • Thank you very much. That was the problem. I was trying to delete line endings for each files first (tr 'r' 'n' < file1 > file1new) before applying awk command. But changing the awk record directly was definitely the solution.

            – Rokhaya BA
            Apr 18 at 8:27

















          • Thank you for your answer. Actually i did try to specify the separator but i get the same result. I also tried to delete end lines and then sorted files. But it doesnt change anything. I still get empty output.

            – Rokhaya BA
            Apr 17 at 14:40












          • @RokhayaBA do your files have DOS-style (CRLF) line endings by any chance? if so, either convert them to Unix style (with dos2unix for example) or change the awk record separator accordingly e.g. BEGINRS="rn"; OFS=FS

            – steeldriver
            Apr 17 at 15:47











          • Thank you very much. That was the problem. I was trying to delete line endings for each files first (tr 'r' 'n' < file1 > file1new) before applying awk command. But changing the awk record directly was definitely the solution.

            – Rokhaya BA
            Apr 18 at 8:27
















          Thank you for your answer. Actually i did try to specify the separator but i get the same result. I also tried to delete end lines and then sorted files. But it doesnt change anything. I still get empty output.

          – Rokhaya BA
          Apr 17 at 14:40






          Thank you for your answer. Actually i did try to specify the separator but i get the same result. I also tried to delete end lines and then sorted files. But it doesnt change anything. I still get empty output.

          – Rokhaya BA
          Apr 17 at 14:40














          @RokhayaBA do your files have DOS-style (CRLF) line endings by any chance? if so, either convert them to Unix style (with dos2unix for example) or change the awk record separator accordingly e.g. BEGINRS="rn"; OFS=FS

          – steeldriver
          Apr 17 at 15:47





          @RokhayaBA do your files have DOS-style (CRLF) line endings by any chance? if so, either convert them to Unix style (with dos2unix for example) or change the awk record separator accordingly e.g. BEGINRS="rn"; OFS=FS

          – steeldriver
          Apr 17 at 15:47













          Thank you very much. That was the problem. I was trying to delete line endings for each files first (tr 'r' 'n' < file1 > file1new) before applying awk command. But changing the awk record directly was definitely the solution.

          – Rokhaya BA
          Apr 18 at 8:27





          Thank you very much. That was the problem. I was trying to delete line endings for each files first (tr 'r' 'n' < file1 > file1new) before applying awk command. But changing the awk record directly was definitely the solution.

          – Rokhaya BA
          Apr 18 at 8:27


















          draft saved

          draft discarded















































          Thanks for contributing an answer to Ask Ubuntu!


          • Please be sure to answer the question. Provide details and share your research!

          But avoid


          • Asking for help, clarification, or responding to other answers.

          • Making statements based on opinion; back them up with references or personal experience.

          To learn more, see our tips on writing great answers.




          draft saved


          draft discarded














          StackExchange.ready(
          function ()
          StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2faskubuntu.com%2fquestions%2f1134662%2fmerge-two-files-depending-on-multiple-matching-columns%23new-answer', 'question_page');

          );

          Post as a guest















          Required, but never shown





















































          Required, but never shown














          Required, but never shown












          Required, but never shown







          Required, but never shown

































          Required, but never shown














          Required, but never shown












          Required, but never shown







          Required, but never shown







          Popular posts from this blog

          Tamil (spriik) Luke uk diar | Nawigatjuun

          Align equal signs while including text over equalitiesAMS align: left aligned text/math plus multicolumn alignmentMultiple alignmentsAligning equations in multiple placesNumbering and aligning an equation with multiple columnsHow to align one equation with another multline equationUsing \ in environments inside the begintabularxNumber equations and preserving alignment of equal signsHow can I align equations to the left and to the right?Double equation alignment problem within align enviromentAligned within align: Why are they right-aligned?

          Training a classifier when some of the features are unknownWhy does Gradient Boosting regression predict negative values when there are no negative y-values in my training set?How to improve an existing (trained) classifier?What is effect when I set up some self defined predisctor variables?Why Matlab neural network classification returns decimal values on prediction dataset?Fitting and transforming text data in training, testing, and validation setsHow to quantify the performance of the classifier (multi-class SVM) using the test data?How do I control for some patients providing multiple samples in my training data?Training and Test setTraining a convolutional neural network for image denoising in MatlabShouldn't an autoencoder with #(neurons in hidden layer) = #(neurons in input layer) be “perfect”?