What does “fetching by region is not available for SAM files” mean? The 2019 Stack Overflow Developer Survey Results Are In Announcing the arrival of Valued Associate #679: Cesar Manara Planned maintenance scheduled April 17/18, 2019 at 00:00UTC (8:00pm US/Eastern)How to safely and efficiently convert subset of bam to fastq?How to you find read statistics for bam files with terminal mismatch?Pepsin digest (cleavage) does not work using RE?running a samtools command for multiple bam files from 1000 genomes projectSAM format: Does the BAM “Integer or numeric array” field no longer exist? Why?python does not quit when the input file is too bigconvert supplementary reads to primary in sam or bamI have 23andme text files and would like to convert to SAM/BAM formatRunning htseq-count over BAM filesAccess base aligned to particular reference position

What was the last x86 CPU that did not have the x87 floating-point unit built in?

Road tyres vs "Street" tyres for charity ride on MTB Tandem

Cooking pasta in a water boiler

What is this lever in Argentinian toilets?

Scientific Reports - Significant Figures

When did F become S in typeography, and why?

system() function string length limit

ELI5: Why do they say that Israel would have been the fourth country to land a spacecraft on the Moon and why do they call it low cost?

Who or what is the being for whom Being is a question for Heidegger?

Why is superheterodyning better than direct conversion?

Match Roman Numerals

Is above average number of years spent on PhD considered a red flag in future academia or industry positions?

Do warforged have souls?

How can I define good in a religion that claims no moral authority?

Python - Fishing Simulator

Take groceries in checked luggage

Wall plug outlet change

First use of “packing” as in carrying a gun

Did God make two great lights or did He make the great light two?

Why does the Event Horizon Telescope (EHT) not include telescopes from Africa, Asia or Australia?

Does Parliament need to approve the new Brexit delay to 31 October 2019?

Can the prologue be the backstory of your main character?

What's the point in a preamp?

Single author papers against my advisor's will?



What does “fetching by region is not available for SAM files” mean?



The 2019 Stack Overflow Developer Survey Results Are In
Announcing the arrival of Valued Associate #679: Cesar Manara
Planned maintenance scheduled April 17/18, 2019 at 00:00UTC (8:00pm US/Eastern)How to safely and efficiently convert subset of bam to fastq?How to you find read statistics for bam files with terminal mismatch?Pepsin digest (cleavage) does not work using RE?running a samtools command for multiple bam files from 1000 genomes projectSAM format: Does the BAM “Integer or numeric array” field no longer exist? Why?python does not quit when the input file is too bigconvert supplementary reads to primary in sam or bamI have 23andme text files and would like to convert to SAM/BAM formatRunning htseq-count over BAM filesAccess base aligned to particular reference position










5












$begingroup$


I am used to gzip/biopython solutions when dealing with sequencing data, but now I wish to switch to more elegant pysam. So I looked at the manual, but ran into quite bizarre troubles with the first couple of lines using my bam file



import pysam
samfile = pysam.AlignmentFile("3_Tms_1_mapped.bam", "rb")
for read in samfile.fetch('3_Tms_b3v08_scaf000159'):
print(read)
samfile.close()


returns ValueError: fetching by region is not available for SAM files. Well, the file is bam. I tried to google the error but the only hits I found are the lines in the source code of pysam that check if the file is bam/cram or sam, so somehow pysam thinks that my bam is a sam. How can I convince it otherwise? I have also noticed that the manual is for python 2.7, that's maybe where the problem comes from...



If you wish to reproduce my problem, here is my bam file.










share|improve this question











$endgroup$
















    5












    $begingroup$


    I am used to gzip/biopython solutions when dealing with sequencing data, but now I wish to switch to more elegant pysam. So I looked at the manual, but ran into quite bizarre troubles with the first couple of lines using my bam file



    import pysam
    samfile = pysam.AlignmentFile("3_Tms_1_mapped.bam", "rb")
    for read in samfile.fetch('3_Tms_b3v08_scaf000159'):
    print(read)
    samfile.close()


    returns ValueError: fetching by region is not available for SAM files. Well, the file is bam. I tried to google the error but the only hits I found are the lines in the source code of pysam that check if the file is bam/cram or sam, so somehow pysam thinks that my bam is a sam. How can I convince it otherwise? I have also noticed that the manual is for python 2.7, that's maybe where the problem comes from...



    If you wish to reproduce my problem, here is my bam file.










    share|improve this question











    $endgroup$














      5












      5








      5





      $begingroup$


      I am used to gzip/biopython solutions when dealing with sequencing data, but now I wish to switch to more elegant pysam. So I looked at the manual, but ran into quite bizarre troubles with the first couple of lines using my bam file



      import pysam
      samfile = pysam.AlignmentFile("3_Tms_1_mapped.bam", "rb")
      for read in samfile.fetch('3_Tms_b3v08_scaf000159'):
      print(read)
      samfile.close()


      returns ValueError: fetching by region is not available for SAM files. Well, the file is bam. I tried to google the error but the only hits I found are the lines in the source code of pysam that check if the file is bam/cram or sam, so somehow pysam thinks that my bam is a sam. How can I convince it otherwise? I have also noticed that the manual is for python 2.7, that's maybe where the problem comes from...



      If you wish to reproduce my problem, here is my bam file.










      share|improve this question











      $endgroup$




      I am used to gzip/biopython solutions when dealing with sequencing data, but now I wish to switch to more elegant pysam. So I looked at the manual, but ran into quite bizarre troubles with the first couple of lines using my bam file



      import pysam
      samfile = pysam.AlignmentFile("3_Tms_1_mapped.bam", "rb")
      for read in samfile.fetch('3_Tms_b3v08_scaf000159'):
      print(read)
      samfile.close()


      returns ValueError: fetching by region is not available for SAM files. Well, the file is bam. I tried to google the error but the only hits I found are the lines in the source code of pysam that check if the file is bam/cram or sam, so somehow pysam thinks that my bam is a sam. How can I convince it otherwise? I have also noticed that the manual is for python 2.7, that's maybe where the problem comes from...



      If you wish to reproduce my problem, here is my bam file.







      python bam pysam






      share|improve this question















      share|improve this question













      share|improve this question




      share|improve this question








      edited 2 days ago









      terdon

      4,7752830




      4,7752830










      asked Apr 10 at 15:51









      Kamil S JaronKamil S Jaron

      2,942742




      2,942742




















          2 Answers
          2






          active

          oldest

          votes


















          5












          $begingroup$

          Your 3_Tms_1_mapped.bam file, despite its filename extension, is in fact a bgzipped SAM file. You can verify this using htsfile, which is a small utility packaged with HTSlib:



          $ htsfile 3_Tms_1_mapped.bam 
          3_Tms_1_mapped.bam: SAM version 1.3 BGZF-compressed sequence data


          (For files that really are in BAM format, it reports BAM version 1 compressed sequence data.)



          So the error message is accurate in this case.






          share|improve this answer











          $endgroup$












          • $begingroup$
            Hmm, does it mean that this link is wrong? biopython.org/DIST/docs/api/Bio.bgzf-module.html I sort of thought that it's the same thing...
            $endgroup$
            – Kamil S Jaron
            Apr 10 at 21:52






          • 1




            $begingroup$
            No. (Was there a particular part of that you think might be wrong?) But there is a difference between BGZF-compressing the plain text format, and a BAM file (whose decompressed underlying stream is a bespoke binary format that is different from the plain SAM text).
            $endgroup$
            – John Marshall
            Apr 10 at 21:58










          • $begingroup$
            I think I get it now. I got confused by the aims to this module, they write "In addition to being required for random access to and writing of BAM files", so I got an impression that I can simply use this module to write BAM files. This was actually really helpful answer, thanks!
            $endgroup$
            – Kamil S Jaron
            2 days ago







          • 1




            $begingroup$
            @KamilSJaron you might want to consider accepting this answer instead of mine since this was the real issue.
            $endgroup$
            – terdon
            2 days ago


















          4












          $begingroup$

          That isn't actually a bam file as John Marshall figured out. I am keeping the rest of my answer since it could be useful to someone else, but the issue here was that you had a compressed (bgzipped) sam file and not an actual bam file and that's why you were getting that error. When I sorted your file in preparation for indexing it, I converted to a bam which is why the rest of my answer worked.




          You don't have the index file for the bam file you're using. I used this script on the file you linked to (changing the name so that it corresponds to the right file and a contig in that file):



          #!/usr/bin/env python3

          import pysam
          samfile = pysam.AlignmentFile("3_Tce_1_mapped.bam", "rb")
          for read in samfile.fetch('3_Tce_b3v08_scaf005149'):
          print(read)
          samfile.close()


          The directory I ran it in had:



          $ ls 3*
          3_Tce_1_mapped.bam


          And I got the error you described:



          $ foo.py
          Traceback (most recent call last):
          File "/home/terdon/scripts/foo.py", line 5, in <module>
          for read in samfile.fetch('3_Tce_b3v08_scaf005149'):
          File "pysam/libcalignmentfile.pyx", line 1107, in pysam.libcalignmentfile.AlignmentFile.fetch
          ValueError: fetching by region is not available for SAM files


          However, indexing the bam file fixed it:



          $ samtools sort 3_Tce_1_mapped.bam > 3_Tce_1_mapped.sorted.bam 
          $
          mv 3_Tce_1_mapped.sorted.bam 3_Tce_1_mapped.bam
          $ samtools index 3_Tce_1_mapped.bam
          $
          ls 3*
          3_Tce_1_mapped.bam 3_Tce_1_mapped.bam.bai

          $ foo.py | wc
          227 2724 16725


          So just sort and index your files before attempting to seek in them. Which makes sense since the index's job is primarily to allow seeking.






          share|improve this answer











          $endgroup$












          • $begingroup$
            I was thinking about indexing, but then I thought that if that would be the problem pysam would complain about sorting/indexing, not about file being sam. Do yo think it would be worth opening them a ticket to make the error message bit more meaningful?
            $endgroup$
            – Kamil S Jaron
            Apr 10 at 18:37










          • $begingroup$
            Seeking, i.e. random access, requires an index. Note that this is not the only access pattern in pysam. You can access reads in a streaming fashion (for read in samfile:) without sorting, and you can access read pileups base-by-base (for column in samfile.pileup('chr1')), which does require an index if I understand correctly.
            $endgroup$
            – Daniel Standage
            Apr 10 at 18:40










          • $begingroup$
            @KamilSJaron probably, yes. It obviously caused you some pain, and I'm sure you won't be alone. I only thought of this because I have seen similar cryptic errors and was wondering how it could seek without an index anyway.
            $endgroup$
            – terdon
            Apr 10 at 18:40






          • 1




            $begingroup$
            @KamilSJaron Yes, I think it's an unhelpful error message, and it has caused me confusion in the past as well. If you open a ticket on Github I'd be happy to "pile on." :-)
            $endgroup$
            – Daniel Standage
            Apr 10 at 18:41










          • $begingroup$
            @DanielStandage I am this close to suspending you for that horrible pun! I would too if I could stop chuckling for long enough!
            $endgroup$
            – terdon
            Apr 10 at 18:42











          Your Answer








          StackExchange.ready(function()
          var channelOptions =
          tags: "".split(" "),
          id: "676"
          ;
          initTagRenderer("".split(" "), "".split(" "), channelOptions);

          StackExchange.using("externalEditor", function()
          // Have to fire editor after snippets, if snippets enabled
          if (StackExchange.settings.snippets.snippetsEnabled)
          StackExchange.using("snippets", function()
          createEditor();
          );

          else
          createEditor();

          );

          function createEditor()
          StackExchange.prepareEditor(
          heartbeatType: 'answer',
          autoActivateHeartbeat: false,
          convertImagesToLinks: false,
          noModals: true,
          showLowRepImageUploadWarning: true,
          reputationToPostImages: null,
          bindNavPrevention: true,
          postfix: "",
          imageUploader:
          brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
          contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
          allowUrls: true
          ,
          onDemand: true,
          discardSelector: ".discard-answer"
          ,immediatelyShowMarkdownHelp:true
          );



          );













          draft saved

          draft discarded


















          StackExchange.ready(
          function ()
          StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fbioinformatics.stackexchange.com%2fquestions%2f7436%2fwhat-does-fetching-by-region-is-not-available-for-sam-files-mean%23new-answer', 'question_page');

          );

          Post as a guest















          Required, but never shown

























          2 Answers
          2






          active

          oldest

          votes








          2 Answers
          2






          active

          oldest

          votes









          active

          oldest

          votes






          active

          oldest

          votes









          5












          $begingroup$

          Your 3_Tms_1_mapped.bam file, despite its filename extension, is in fact a bgzipped SAM file. You can verify this using htsfile, which is a small utility packaged with HTSlib:



          $ htsfile 3_Tms_1_mapped.bam 
          3_Tms_1_mapped.bam: SAM version 1.3 BGZF-compressed sequence data


          (For files that really are in BAM format, it reports BAM version 1 compressed sequence data.)



          So the error message is accurate in this case.






          share|improve this answer











          $endgroup$












          • $begingroup$
            Hmm, does it mean that this link is wrong? biopython.org/DIST/docs/api/Bio.bgzf-module.html I sort of thought that it's the same thing...
            $endgroup$
            – Kamil S Jaron
            Apr 10 at 21:52






          • 1




            $begingroup$
            No. (Was there a particular part of that you think might be wrong?) But there is a difference between BGZF-compressing the plain text format, and a BAM file (whose decompressed underlying stream is a bespoke binary format that is different from the plain SAM text).
            $endgroup$
            – John Marshall
            Apr 10 at 21:58










          • $begingroup$
            I think I get it now. I got confused by the aims to this module, they write "In addition to being required for random access to and writing of BAM files", so I got an impression that I can simply use this module to write BAM files. This was actually really helpful answer, thanks!
            $endgroup$
            – Kamil S Jaron
            2 days ago







          • 1




            $begingroup$
            @KamilSJaron you might want to consider accepting this answer instead of mine since this was the real issue.
            $endgroup$
            – terdon
            2 days ago















          5












          $begingroup$

          Your 3_Tms_1_mapped.bam file, despite its filename extension, is in fact a bgzipped SAM file. You can verify this using htsfile, which is a small utility packaged with HTSlib:



          $ htsfile 3_Tms_1_mapped.bam 
          3_Tms_1_mapped.bam: SAM version 1.3 BGZF-compressed sequence data


          (For files that really are in BAM format, it reports BAM version 1 compressed sequence data.)



          So the error message is accurate in this case.






          share|improve this answer











          $endgroup$












          • $begingroup$
            Hmm, does it mean that this link is wrong? biopython.org/DIST/docs/api/Bio.bgzf-module.html I sort of thought that it's the same thing...
            $endgroup$
            – Kamil S Jaron
            Apr 10 at 21:52






          • 1




            $begingroup$
            No. (Was there a particular part of that you think might be wrong?) But there is a difference between BGZF-compressing the plain text format, and a BAM file (whose decompressed underlying stream is a bespoke binary format that is different from the plain SAM text).
            $endgroup$
            – John Marshall
            Apr 10 at 21:58










          • $begingroup$
            I think I get it now. I got confused by the aims to this module, they write "In addition to being required for random access to and writing of BAM files", so I got an impression that I can simply use this module to write BAM files. This was actually really helpful answer, thanks!
            $endgroup$
            – Kamil S Jaron
            2 days ago







          • 1




            $begingroup$
            @KamilSJaron you might want to consider accepting this answer instead of mine since this was the real issue.
            $endgroup$
            – terdon
            2 days ago













          5












          5








          5





          $begingroup$

          Your 3_Tms_1_mapped.bam file, despite its filename extension, is in fact a bgzipped SAM file. You can verify this using htsfile, which is a small utility packaged with HTSlib:



          $ htsfile 3_Tms_1_mapped.bam 
          3_Tms_1_mapped.bam: SAM version 1.3 BGZF-compressed sequence data


          (For files that really are in BAM format, it reports BAM version 1 compressed sequence data.)



          So the error message is accurate in this case.






          share|improve this answer











          $endgroup$



          Your 3_Tms_1_mapped.bam file, despite its filename extension, is in fact a bgzipped SAM file. You can verify this using htsfile, which is a small utility packaged with HTSlib:



          $ htsfile 3_Tms_1_mapped.bam 
          3_Tms_1_mapped.bam: SAM version 1.3 BGZF-compressed sequence data


          (For files that really are in BAM format, it reports BAM version 1 compressed sequence data.)



          So the error message is accurate in this case.







          share|improve this answer














          share|improve this answer



          share|improve this answer








          edited 2 days ago

























          answered Apr 10 at 20:43









          John MarshallJohn Marshall

          56826




          56826











          • $begingroup$
            Hmm, does it mean that this link is wrong? biopython.org/DIST/docs/api/Bio.bgzf-module.html I sort of thought that it's the same thing...
            $endgroup$
            – Kamil S Jaron
            Apr 10 at 21:52






          • 1




            $begingroup$
            No. (Was there a particular part of that you think might be wrong?) But there is a difference between BGZF-compressing the plain text format, and a BAM file (whose decompressed underlying stream is a bespoke binary format that is different from the plain SAM text).
            $endgroup$
            – John Marshall
            Apr 10 at 21:58










          • $begingroup$
            I think I get it now. I got confused by the aims to this module, they write "In addition to being required for random access to and writing of BAM files", so I got an impression that I can simply use this module to write BAM files. This was actually really helpful answer, thanks!
            $endgroup$
            – Kamil S Jaron
            2 days ago







          • 1




            $begingroup$
            @KamilSJaron you might want to consider accepting this answer instead of mine since this was the real issue.
            $endgroup$
            – terdon
            2 days ago
















          • $begingroup$
            Hmm, does it mean that this link is wrong? biopython.org/DIST/docs/api/Bio.bgzf-module.html I sort of thought that it's the same thing...
            $endgroup$
            – Kamil S Jaron
            Apr 10 at 21:52






          • 1




            $begingroup$
            No. (Was there a particular part of that you think might be wrong?) But there is a difference between BGZF-compressing the plain text format, and a BAM file (whose decompressed underlying stream is a bespoke binary format that is different from the plain SAM text).
            $endgroup$
            – John Marshall
            Apr 10 at 21:58










          • $begingroup$
            I think I get it now. I got confused by the aims to this module, they write "In addition to being required for random access to and writing of BAM files", so I got an impression that I can simply use this module to write BAM files. This was actually really helpful answer, thanks!
            $endgroup$
            – Kamil S Jaron
            2 days ago







          • 1




            $begingroup$
            @KamilSJaron you might want to consider accepting this answer instead of mine since this was the real issue.
            $endgroup$
            – terdon
            2 days ago















          $begingroup$
          Hmm, does it mean that this link is wrong? biopython.org/DIST/docs/api/Bio.bgzf-module.html I sort of thought that it's the same thing...
          $endgroup$
          – Kamil S Jaron
          Apr 10 at 21:52




          $begingroup$
          Hmm, does it mean that this link is wrong? biopython.org/DIST/docs/api/Bio.bgzf-module.html I sort of thought that it's the same thing...
          $endgroup$
          – Kamil S Jaron
          Apr 10 at 21:52




          1




          1




          $begingroup$
          No. (Was there a particular part of that you think might be wrong?) But there is a difference between BGZF-compressing the plain text format, and a BAM file (whose decompressed underlying stream is a bespoke binary format that is different from the plain SAM text).
          $endgroup$
          – John Marshall
          Apr 10 at 21:58




          $begingroup$
          No. (Was there a particular part of that you think might be wrong?) But there is a difference between BGZF-compressing the plain text format, and a BAM file (whose decompressed underlying stream is a bespoke binary format that is different from the plain SAM text).
          $endgroup$
          – John Marshall
          Apr 10 at 21:58












          $begingroup$
          I think I get it now. I got confused by the aims to this module, they write "In addition to being required for random access to and writing of BAM files", so I got an impression that I can simply use this module to write BAM files. This was actually really helpful answer, thanks!
          $endgroup$
          – Kamil S Jaron
          2 days ago





          $begingroup$
          I think I get it now. I got confused by the aims to this module, they write "In addition to being required for random access to and writing of BAM files", so I got an impression that I can simply use this module to write BAM files. This was actually really helpful answer, thanks!
          $endgroup$
          – Kamil S Jaron
          2 days ago





          1




          1




          $begingroup$
          @KamilSJaron you might want to consider accepting this answer instead of mine since this was the real issue.
          $endgroup$
          – terdon
          2 days ago




          $begingroup$
          @KamilSJaron you might want to consider accepting this answer instead of mine since this was the real issue.
          $endgroup$
          – terdon
          2 days ago











          4












          $begingroup$

          That isn't actually a bam file as John Marshall figured out. I am keeping the rest of my answer since it could be useful to someone else, but the issue here was that you had a compressed (bgzipped) sam file and not an actual bam file and that's why you were getting that error. When I sorted your file in preparation for indexing it, I converted to a bam which is why the rest of my answer worked.




          You don't have the index file for the bam file you're using. I used this script on the file you linked to (changing the name so that it corresponds to the right file and a contig in that file):



          #!/usr/bin/env python3

          import pysam
          samfile = pysam.AlignmentFile("3_Tce_1_mapped.bam", "rb")
          for read in samfile.fetch('3_Tce_b3v08_scaf005149'):
          print(read)
          samfile.close()


          The directory I ran it in had:



          $ ls 3*
          3_Tce_1_mapped.bam


          And I got the error you described:



          $ foo.py
          Traceback (most recent call last):
          File "/home/terdon/scripts/foo.py", line 5, in <module>
          for read in samfile.fetch('3_Tce_b3v08_scaf005149'):
          File "pysam/libcalignmentfile.pyx", line 1107, in pysam.libcalignmentfile.AlignmentFile.fetch
          ValueError: fetching by region is not available for SAM files


          However, indexing the bam file fixed it:



          $ samtools sort 3_Tce_1_mapped.bam > 3_Tce_1_mapped.sorted.bam 
          $
          mv 3_Tce_1_mapped.sorted.bam 3_Tce_1_mapped.bam
          $ samtools index 3_Tce_1_mapped.bam
          $
          ls 3*
          3_Tce_1_mapped.bam 3_Tce_1_mapped.bam.bai

          $ foo.py | wc
          227 2724 16725


          So just sort and index your files before attempting to seek in them. Which makes sense since the index's job is primarily to allow seeking.






          share|improve this answer











          $endgroup$












          • $begingroup$
            I was thinking about indexing, but then I thought that if that would be the problem pysam would complain about sorting/indexing, not about file being sam. Do yo think it would be worth opening them a ticket to make the error message bit more meaningful?
            $endgroup$
            – Kamil S Jaron
            Apr 10 at 18:37










          • $begingroup$
            Seeking, i.e. random access, requires an index. Note that this is not the only access pattern in pysam. You can access reads in a streaming fashion (for read in samfile:) without sorting, and you can access read pileups base-by-base (for column in samfile.pileup('chr1')), which does require an index if I understand correctly.
            $endgroup$
            – Daniel Standage
            Apr 10 at 18:40










          • $begingroup$
            @KamilSJaron probably, yes. It obviously caused you some pain, and I'm sure you won't be alone. I only thought of this because I have seen similar cryptic errors and was wondering how it could seek without an index anyway.
            $endgroup$
            – terdon
            Apr 10 at 18:40






          • 1




            $begingroup$
            @KamilSJaron Yes, I think it's an unhelpful error message, and it has caused me confusion in the past as well. If you open a ticket on Github I'd be happy to "pile on." :-)
            $endgroup$
            – Daniel Standage
            Apr 10 at 18:41










          • $begingroup$
            @DanielStandage I am this close to suspending you for that horrible pun! I would too if I could stop chuckling for long enough!
            $endgroup$
            – terdon
            Apr 10 at 18:42















          4












          $begingroup$

          That isn't actually a bam file as John Marshall figured out. I am keeping the rest of my answer since it could be useful to someone else, but the issue here was that you had a compressed (bgzipped) sam file and not an actual bam file and that's why you were getting that error. When I sorted your file in preparation for indexing it, I converted to a bam which is why the rest of my answer worked.




          You don't have the index file for the bam file you're using. I used this script on the file you linked to (changing the name so that it corresponds to the right file and a contig in that file):



          #!/usr/bin/env python3

          import pysam
          samfile = pysam.AlignmentFile("3_Tce_1_mapped.bam", "rb")
          for read in samfile.fetch('3_Tce_b3v08_scaf005149'):
          print(read)
          samfile.close()


          The directory I ran it in had:



          $ ls 3*
          3_Tce_1_mapped.bam


          And I got the error you described:



          $ foo.py
          Traceback (most recent call last):
          File "/home/terdon/scripts/foo.py", line 5, in <module>
          for read in samfile.fetch('3_Tce_b3v08_scaf005149'):
          File "pysam/libcalignmentfile.pyx", line 1107, in pysam.libcalignmentfile.AlignmentFile.fetch
          ValueError: fetching by region is not available for SAM files


          However, indexing the bam file fixed it:



          $ samtools sort 3_Tce_1_mapped.bam > 3_Tce_1_mapped.sorted.bam 
          $
          mv 3_Tce_1_mapped.sorted.bam 3_Tce_1_mapped.bam
          $ samtools index 3_Tce_1_mapped.bam
          $
          ls 3*
          3_Tce_1_mapped.bam 3_Tce_1_mapped.bam.bai

          $ foo.py | wc
          227 2724 16725


          So just sort and index your files before attempting to seek in them. Which makes sense since the index's job is primarily to allow seeking.






          share|improve this answer











          $endgroup$












          • $begingroup$
            I was thinking about indexing, but then I thought that if that would be the problem pysam would complain about sorting/indexing, not about file being sam. Do yo think it would be worth opening them a ticket to make the error message bit more meaningful?
            $endgroup$
            – Kamil S Jaron
            Apr 10 at 18:37










          • $begingroup$
            Seeking, i.e. random access, requires an index. Note that this is not the only access pattern in pysam. You can access reads in a streaming fashion (for read in samfile:) without sorting, and you can access read pileups base-by-base (for column in samfile.pileup('chr1')), which does require an index if I understand correctly.
            $endgroup$
            – Daniel Standage
            Apr 10 at 18:40










          • $begingroup$
            @KamilSJaron probably, yes. It obviously caused you some pain, and I'm sure you won't be alone. I only thought of this because I have seen similar cryptic errors and was wondering how it could seek without an index anyway.
            $endgroup$
            – terdon
            Apr 10 at 18:40






          • 1




            $begingroup$
            @KamilSJaron Yes, I think it's an unhelpful error message, and it has caused me confusion in the past as well. If you open a ticket on Github I'd be happy to "pile on." :-)
            $endgroup$
            – Daniel Standage
            Apr 10 at 18:41










          • $begingroup$
            @DanielStandage I am this close to suspending you for that horrible pun! I would too if I could stop chuckling for long enough!
            $endgroup$
            – terdon
            Apr 10 at 18:42













          4












          4








          4





          $begingroup$

          That isn't actually a bam file as John Marshall figured out. I am keeping the rest of my answer since it could be useful to someone else, but the issue here was that you had a compressed (bgzipped) sam file and not an actual bam file and that's why you were getting that error. When I sorted your file in preparation for indexing it, I converted to a bam which is why the rest of my answer worked.




          You don't have the index file for the bam file you're using. I used this script on the file you linked to (changing the name so that it corresponds to the right file and a contig in that file):



          #!/usr/bin/env python3

          import pysam
          samfile = pysam.AlignmentFile("3_Tce_1_mapped.bam", "rb")
          for read in samfile.fetch('3_Tce_b3v08_scaf005149'):
          print(read)
          samfile.close()


          The directory I ran it in had:



          $ ls 3*
          3_Tce_1_mapped.bam


          And I got the error you described:



          $ foo.py
          Traceback (most recent call last):
          File "/home/terdon/scripts/foo.py", line 5, in <module>
          for read in samfile.fetch('3_Tce_b3v08_scaf005149'):
          File "pysam/libcalignmentfile.pyx", line 1107, in pysam.libcalignmentfile.AlignmentFile.fetch
          ValueError: fetching by region is not available for SAM files


          However, indexing the bam file fixed it:



          $ samtools sort 3_Tce_1_mapped.bam > 3_Tce_1_mapped.sorted.bam 
          $
          mv 3_Tce_1_mapped.sorted.bam 3_Tce_1_mapped.bam
          $ samtools index 3_Tce_1_mapped.bam
          $
          ls 3*
          3_Tce_1_mapped.bam 3_Tce_1_mapped.bam.bai

          $ foo.py | wc
          227 2724 16725


          So just sort and index your files before attempting to seek in them. Which makes sense since the index's job is primarily to allow seeking.






          share|improve this answer











          $endgroup$



          That isn't actually a bam file as John Marshall figured out. I am keeping the rest of my answer since it could be useful to someone else, but the issue here was that you had a compressed (bgzipped) sam file and not an actual bam file and that's why you were getting that error. When I sorted your file in preparation for indexing it, I converted to a bam which is why the rest of my answer worked.




          You don't have the index file for the bam file you're using. I used this script on the file you linked to (changing the name so that it corresponds to the right file and a contig in that file):



          #!/usr/bin/env python3

          import pysam
          samfile = pysam.AlignmentFile("3_Tce_1_mapped.bam", "rb")
          for read in samfile.fetch('3_Tce_b3v08_scaf005149'):
          print(read)
          samfile.close()


          The directory I ran it in had:



          $ ls 3*
          3_Tce_1_mapped.bam


          And I got the error you described:



          $ foo.py
          Traceback (most recent call last):
          File "/home/terdon/scripts/foo.py", line 5, in <module>
          for read in samfile.fetch('3_Tce_b3v08_scaf005149'):
          File "pysam/libcalignmentfile.pyx", line 1107, in pysam.libcalignmentfile.AlignmentFile.fetch
          ValueError: fetching by region is not available for SAM files


          However, indexing the bam file fixed it:



          $ samtools sort 3_Tce_1_mapped.bam > 3_Tce_1_mapped.sorted.bam 
          $
          mv 3_Tce_1_mapped.sorted.bam 3_Tce_1_mapped.bam
          $ samtools index 3_Tce_1_mapped.bam
          $
          ls 3*
          3_Tce_1_mapped.bam 3_Tce_1_mapped.bam.bai

          $ foo.py | wc
          227 2724 16725


          So just sort and index your files before attempting to seek in them. Which makes sense since the index's job is primarily to allow seeking.







          share|improve this answer














          share|improve this answer



          share|improve this answer








          edited 2 days ago

























          answered Apr 10 at 16:10









          terdonterdon

          4,7752830




          4,7752830











          • $begingroup$
            I was thinking about indexing, but then I thought that if that would be the problem pysam would complain about sorting/indexing, not about file being sam. Do yo think it would be worth opening them a ticket to make the error message bit more meaningful?
            $endgroup$
            – Kamil S Jaron
            Apr 10 at 18:37










          • $begingroup$
            Seeking, i.e. random access, requires an index. Note that this is not the only access pattern in pysam. You can access reads in a streaming fashion (for read in samfile:) without sorting, and you can access read pileups base-by-base (for column in samfile.pileup('chr1')), which does require an index if I understand correctly.
            $endgroup$
            – Daniel Standage
            Apr 10 at 18:40










          • $begingroup$
            @KamilSJaron probably, yes. It obviously caused you some pain, and I'm sure you won't be alone. I only thought of this because I have seen similar cryptic errors and was wondering how it could seek without an index anyway.
            $endgroup$
            – terdon
            Apr 10 at 18:40






          • 1




            $begingroup$
            @KamilSJaron Yes, I think it's an unhelpful error message, and it has caused me confusion in the past as well. If you open a ticket on Github I'd be happy to "pile on." :-)
            $endgroup$
            – Daniel Standage
            Apr 10 at 18:41










          • $begingroup$
            @DanielStandage I am this close to suspending you for that horrible pun! I would too if I could stop chuckling for long enough!
            $endgroup$
            – terdon
            Apr 10 at 18:42
















          • $begingroup$
            I was thinking about indexing, but then I thought that if that would be the problem pysam would complain about sorting/indexing, not about file being sam. Do yo think it would be worth opening them a ticket to make the error message bit more meaningful?
            $endgroup$
            – Kamil S Jaron
            Apr 10 at 18:37










          • $begingroup$
            Seeking, i.e. random access, requires an index. Note that this is not the only access pattern in pysam. You can access reads in a streaming fashion (for read in samfile:) without sorting, and you can access read pileups base-by-base (for column in samfile.pileup('chr1')), which does require an index if I understand correctly.
            $endgroup$
            – Daniel Standage
            Apr 10 at 18:40










          • $begingroup$
            @KamilSJaron probably, yes. It obviously caused you some pain, and I'm sure you won't be alone. I only thought of this because I have seen similar cryptic errors and was wondering how it could seek without an index anyway.
            $endgroup$
            – terdon
            Apr 10 at 18:40






          • 1




            $begingroup$
            @KamilSJaron Yes, I think it's an unhelpful error message, and it has caused me confusion in the past as well. If you open a ticket on Github I'd be happy to "pile on." :-)
            $endgroup$
            – Daniel Standage
            Apr 10 at 18:41










          • $begingroup$
            @DanielStandage I am this close to suspending you for that horrible pun! I would too if I could stop chuckling for long enough!
            $endgroup$
            – terdon
            Apr 10 at 18:42















          $begingroup$
          I was thinking about indexing, but then I thought that if that would be the problem pysam would complain about sorting/indexing, not about file being sam. Do yo think it would be worth opening them a ticket to make the error message bit more meaningful?
          $endgroup$
          – Kamil S Jaron
          Apr 10 at 18:37




          $begingroup$
          I was thinking about indexing, but then I thought that if that would be the problem pysam would complain about sorting/indexing, not about file being sam. Do yo think it would be worth opening them a ticket to make the error message bit more meaningful?
          $endgroup$
          – Kamil S Jaron
          Apr 10 at 18:37












          $begingroup$
          Seeking, i.e. random access, requires an index. Note that this is not the only access pattern in pysam. You can access reads in a streaming fashion (for read in samfile:) without sorting, and you can access read pileups base-by-base (for column in samfile.pileup('chr1')), which does require an index if I understand correctly.
          $endgroup$
          – Daniel Standage
          Apr 10 at 18:40




          $begingroup$
          Seeking, i.e. random access, requires an index. Note that this is not the only access pattern in pysam. You can access reads in a streaming fashion (for read in samfile:) without sorting, and you can access read pileups base-by-base (for column in samfile.pileup('chr1')), which does require an index if I understand correctly.
          $endgroup$
          – Daniel Standage
          Apr 10 at 18:40












          $begingroup$
          @KamilSJaron probably, yes. It obviously caused you some pain, and I'm sure you won't be alone. I only thought of this because I have seen similar cryptic errors and was wondering how it could seek without an index anyway.
          $endgroup$
          – terdon
          Apr 10 at 18:40




          $begingroup$
          @KamilSJaron probably, yes. It obviously caused you some pain, and I'm sure you won't be alone. I only thought of this because I have seen similar cryptic errors and was wondering how it could seek without an index anyway.
          $endgroup$
          – terdon
          Apr 10 at 18:40




          1




          1




          $begingroup$
          @KamilSJaron Yes, I think it's an unhelpful error message, and it has caused me confusion in the past as well. If you open a ticket on Github I'd be happy to "pile on." :-)
          $endgroup$
          – Daniel Standage
          Apr 10 at 18:41




          $begingroup$
          @KamilSJaron Yes, I think it's an unhelpful error message, and it has caused me confusion in the past as well. If you open a ticket on Github I'd be happy to "pile on." :-)
          $endgroup$
          – Daniel Standage
          Apr 10 at 18:41












          $begingroup$
          @DanielStandage I am this close to suspending you for that horrible pun! I would too if I could stop chuckling for long enough!
          $endgroup$
          – terdon
          Apr 10 at 18:42




          $begingroup$
          @DanielStandage I am this close to suspending you for that horrible pun! I would too if I could stop chuckling for long enough!
          $endgroup$
          – terdon
          Apr 10 at 18:42

















          draft saved

          draft discarded
















































          Thanks for contributing an answer to Bioinformatics Stack Exchange!


          • Please be sure to answer the question. Provide details and share your research!

          But avoid


          • Asking for help, clarification, or responding to other answers.

          • Making statements based on opinion; back them up with references or personal experience.

          Use MathJax to format equations. MathJax reference.


          To learn more, see our tips on writing great answers.




          draft saved


          draft discarded














          StackExchange.ready(
          function ()
          StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fbioinformatics.stackexchange.com%2fquestions%2f7436%2fwhat-does-fetching-by-region-is-not-available-for-sam-files-mean%23new-answer', 'question_page');

          );

          Post as a guest















          Required, but never shown





















































          Required, but never shown














          Required, but never shown












          Required, but never shown







          Required, but never shown

































          Required, but never shown














          Required, but never shown












          Required, but never shown







          Required, but never shown







          Popular posts from this blog

          Tamil (spriik) Luke uk diar | Nawigatjuun

          Align equal signs while including text over equalitiesAMS align: left aligned text/math plus multicolumn alignmentMultiple alignmentsAligning equations in multiple placesNumbering and aligning an equation with multiple columnsHow to align one equation with another multline equationUsing \ in environments inside the begintabularxNumber equations and preserving alignment of equal signsHow can I align equations to the left and to the right?Double equation alignment problem within align enviromentAligned within align: Why are they right-aligned?

          Training a classifier when some of the features are unknownWhy does Gradient Boosting regression predict negative values when there are no negative y-values in my training set?How to improve an existing (trained) classifier?What is effect when I set up some self defined predisctor variables?Why Matlab neural network classification returns decimal values on prediction dataset?Fitting and transforming text data in training, testing, and validation setsHow to quantify the performance of the classifier (multi-class SVM) using the test data?How do I control for some patients providing multiple samples in my training data?Training and Test setTraining a convolutional neural network for image denoising in MatlabShouldn't an autoencoder with #(neurons in hidden layer) = #(neurons in input layer) be “perfect”?