What does “fetching by region is not available for SAM files” mean? Planned maintenance scheduled April 17/18, 2019 at 00:00UTC (8:00pm US/Eastern) Announcing the arrival of Valued Associate #679: Cesar Manara Unicorn Meta Zoo #1: Why another podcast?How to safely and efficiently convert subset of bam to fastq?How to you find read statistics for bam files with terminal mismatch?Pepsin digest (cleavage) does not work using RE?running a samtools command for multiple bam files from 1000 genomes projectSAM format: Does the BAM “Integer or numeric array” field no longer exist? Why?python does not quit when the input file is too bigconvert supplementary reads to primary in sam or bamI have 23andme text files and would like to convert to SAM/BAM formatRunning htseq-count over BAM filesAccess base aligned to particular reference position

Estimate capacitor parameters

Why is "Captain Marvel" translated as male in Portugal?

Estimated State payment too big --> money back; + 2018 Tax Reform

Fishing simulator

What's the difference between (size_t)-1 and ~0?

If I can make up priors, why can't I make up posteriors?

Am I ethically obligated to go into work on an off day if the reason is sudden?

How do you clear the ApexPages.getMessages() collection in a test?

How to pour concrete for curved walkway to prevent cracking?

Keep going mode for require-package

How to rotate it perfectly?

How did passengers keep warm on sail ships?

How to colour the US map with Yellow, Green, Red and Blue to minimize the number of states with the colour of Green

Can a 1st-level character have an ability score above 18?

What kind of display is this?

How to politely respond to generic emails requesting a PhD/job in my lab? Without wasting too much time

Can the prologue be the backstory of your main character?

New Order #5: where Fibonacci and Beatty meet at Wythoff

Cold is to Refrigerator as warm is to?

Need a suitable toxic chemical for a murder plot in my novel

How to market an anarchic city as a tourism spot to people living in civilized areas?

How can I protect witches in combat who wear limited clothing?

Why use gamma over alpha radiation?

Who can trigger ship-wide alerts in Star Trek?



What does “fetching by region is not available for SAM files” mean?



Planned maintenance scheduled April 17/18, 2019 at 00:00UTC (8:00pm US/Eastern)
Announcing the arrival of Valued Associate #679: Cesar Manara
Unicorn Meta Zoo #1: Why another podcast?How to safely and efficiently convert subset of bam to fastq?How to you find read statistics for bam files with terminal mismatch?Pepsin digest (cleavage) does not work using RE?running a samtools command for multiple bam files from 1000 genomes projectSAM format: Does the BAM “Integer or numeric array” field no longer exist? Why?python does not quit when the input file is too bigconvert supplementary reads to primary in sam or bamI have 23andme text files and would like to convert to SAM/BAM formatRunning htseq-count over BAM filesAccess base aligned to particular reference position










5












$begingroup$


I am used to gzip/biopython solutions when dealing with sequencing data, but now I wish to switch to more elegant pysam. So I looked at the manual, but ran into quite bizarre troubles with the first couple of lines using my bam file



import pysam
samfile = pysam.AlignmentFile("3_Tms_1_mapped.bam", "rb")
for read in samfile.fetch('3_Tms_b3v08_scaf000159'):
print(read)
samfile.close()


returns ValueError: fetching by region is not available for SAM files. Well, the file is bam. I tried to google the error but the only hits I found are the lines in the source code of pysam that check if the file is bam/cram or sam, so somehow pysam thinks that my bam is a sam. How can I convince it otherwise? I have also noticed that the manual is for python 2.7, that's maybe where the problem comes from...



If you wish to reproduce my problem, here is my bam file.










share|improve this question











$endgroup$
















    5












    $begingroup$


    I am used to gzip/biopython solutions when dealing with sequencing data, but now I wish to switch to more elegant pysam. So I looked at the manual, but ran into quite bizarre troubles with the first couple of lines using my bam file



    import pysam
    samfile = pysam.AlignmentFile("3_Tms_1_mapped.bam", "rb")
    for read in samfile.fetch('3_Tms_b3v08_scaf000159'):
    print(read)
    samfile.close()


    returns ValueError: fetching by region is not available for SAM files. Well, the file is bam. I tried to google the error but the only hits I found are the lines in the source code of pysam that check if the file is bam/cram or sam, so somehow pysam thinks that my bam is a sam. How can I convince it otherwise? I have also noticed that the manual is for python 2.7, that's maybe where the problem comes from...



    If you wish to reproduce my problem, here is my bam file.










    share|improve this question











    $endgroup$














      5












      5








      5





      $begingroup$


      I am used to gzip/biopython solutions when dealing with sequencing data, but now I wish to switch to more elegant pysam. So I looked at the manual, but ran into quite bizarre troubles with the first couple of lines using my bam file



      import pysam
      samfile = pysam.AlignmentFile("3_Tms_1_mapped.bam", "rb")
      for read in samfile.fetch('3_Tms_b3v08_scaf000159'):
      print(read)
      samfile.close()


      returns ValueError: fetching by region is not available for SAM files. Well, the file is bam. I tried to google the error but the only hits I found are the lines in the source code of pysam that check if the file is bam/cram or sam, so somehow pysam thinks that my bam is a sam. How can I convince it otherwise? I have also noticed that the manual is for python 2.7, that's maybe where the problem comes from...



      If you wish to reproduce my problem, here is my bam file.










      share|improve this question











      $endgroup$




      I am used to gzip/biopython solutions when dealing with sequencing data, but now I wish to switch to more elegant pysam. So I looked at the manual, but ran into quite bizarre troubles with the first couple of lines using my bam file



      import pysam
      samfile = pysam.AlignmentFile("3_Tms_1_mapped.bam", "rb")
      for read in samfile.fetch('3_Tms_b3v08_scaf000159'):
      print(read)
      samfile.close()


      returns ValueError: fetching by region is not available for SAM files. Well, the file is bam. I tried to google the error but the only hits I found are the lines in the source code of pysam that check if the file is bam/cram or sam, so somehow pysam thinks that my bam is a sam. How can I convince it otherwise? I have also noticed that the manual is for python 2.7, that's maybe where the problem comes from...



      If you wish to reproduce my problem, here is my bam file.







      python bam pysam






      share|improve this question















      share|improve this question













      share|improve this question




      share|improve this question








      edited Apr 11 at 10:17









      terdon

      4,7752830




      4,7752830










      asked Apr 10 at 15:51









      Kamil S JaronKamil S Jaron

      2,942742




      2,942742




















          2 Answers
          2






          active

          oldest

          votes


















          5












          $begingroup$

          Your 3_Tms_1_mapped.bam file, despite its filename extension, is in fact a bgzipped SAM file. You can verify this using htsfile, which is a small utility packaged with HTSlib:



          $ htsfile 3_Tms_1_mapped.bam 
          3_Tms_1_mapped.bam: SAM version 1.3 BGZF-compressed sequence data


          (For files that really are in BAM format, it reports BAM version 1 compressed sequence data.)



          So the error message is accurate in this case.






          share|improve this answer











          $endgroup$












          • $begingroup$
            Hmm, does it mean that this link is wrong? biopython.org/DIST/docs/api/Bio.bgzf-module.html I sort of thought that it's the same thing...
            $endgroup$
            – Kamil S Jaron
            Apr 10 at 21:52






          • 1




            $begingroup$
            No. (Was there a particular part of that you think might be wrong?) But there is a difference between BGZF-compressing the plain text format, and a BAM file (whose decompressed underlying stream is a bespoke binary format that is different from the plain SAM text).
            $endgroup$
            – John Marshall
            Apr 10 at 21:58










          • $begingroup$
            I think I get it now. I got confused by the aims to this module, they write "In addition to being required for random access to and writing of BAM files", so I got an impression that I can simply use this module to write BAM files. This was actually really helpful answer, thanks!
            $endgroup$
            – Kamil S Jaron
            Apr 11 at 7:05







          • 1




            $begingroup$
            @KamilSJaron you might want to consider accepting this answer instead of mine since this was the real issue.
            $endgroup$
            – terdon
            Apr 11 at 8:10


















          4












          $begingroup$

          That isn't actually a bam file as John Marshall figured out. I am keeping the rest of my answer since it could be useful to someone else, but the issue here was that you had a compressed (bgzipped) sam file and not an actual bam file and that's why you were getting that error. When I sorted your file in preparation for indexing it, I converted to a bam which is why the rest of my answer worked.




          You don't have the index file for the bam file you're using. I used this script on the file you linked to (changing the name so that it corresponds to the right file and a contig in that file):



          #!/usr/bin/env python3

          import pysam
          samfile = pysam.AlignmentFile("3_Tce_1_mapped.bam", "rb")
          for read in samfile.fetch('3_Tce_b3v08_scaf005149'):
          print(read)
          samfile.close()


          The directory I ran it in had:



          $ ls 3*
          3_Tce_1_mapped.bam


          And I got the error you described:



          $ foo.py
          Traceback (most recent call last):
          File "/home/terdon/scripts/foo.py", line 5, in <module>
          for read in samfile.fetch('3_Tce_b3v08_scaf005149'):
          File "pysam/libcalignmentfile.pyx", line 1107, in pysam.libcalignmentfile.AlignmentFile.fetch
          ValueError: fetching by region is not available for SAM files


          However, indexing the bam file fixed it:



          $ samtools sort 3_Tce_1_mapped.bam > 3_Tce_1_mapped.sorted.bam 
          $
          mv 3_Tce_1_mapped.sorted.bam 3_Tce_1_mapped.bam
          $ samtools index 3_Tce_1_mapped.bam
          $
          ls 3*
          3_Tce_1_mapped.bam 3_Tce_1_mapped.bam.bai

          $ foo.py | wc
          227 2724 16725


          So just sort and index your files before attempting to seek in them. Which makes sense since the index's job is primarily to allow seeking.






          share|improve this answer











          $endgroup$












          • $begingroup$
            I was thinking about indexing, but then I thought that if that would be the problem pysam would complain about sorting/indexing, not about file being sam. Do yo think it would be worth opening them a ticket to make the error message bit more meaningful?
            $endgroup$
            – Kamil S Jaron
            Apr 10 at 18:37










          • $begingroup$
            Seeking, i.e. random access, requires an index. Note that this is not the only access pattern in pysam. You can access reads in a streaming fashion (for read in samfile:) without sorting, and you can access read pileups base-by-base (for column in samfile.pileup('chr1')), which does require an index if I understand correctly.
            $endgroup$
            – Daniel Standage
            Apr 10 at 18:40










          • $begingroup$
            @KamilSJaron probably, yes. It obviously caused you some pain, and I'm sure you won't be alone. I only thought of this because I have seen similar cryptic errors and was wondering how it could seek without an index anyway.
            $endgroup$
            – terdon
            Apr 10 at 18:40






          • 1




            $begingroup$
            @KamilSJaron Yes, I think it's an unhelpful error message, and it has caused me confusion in the past as well. If you open a ticket on Github I'd be happy to "pile on." :-)
            $endgroup$
            – Daniel Standage
            Apr 10 at 18:41










          • $begingroup$
            @DanielStandage I am this close to suspending you for that horrible pun! I would too if I could stop chuckling for long enough!
            $endgroup$
            – terdon
            Apr 10 at 18:42











          Your Answer








          StackExchange.ready(function()
          var channelOptions =
          tags: "".split(" "),
          id: "676"
          ;
          initTagRenderer("".split(" "), "".split(" "), channelOptions);

          StackExchange.using("externalEditor", function()
          // Have to fire editor after snippets, if snippets enabled
          if (StackExchange.settings.snippets.snippetsEnabled)
          StackExchange.using("snippets", function()
          createEditor();
          );

          else
          createEditor();

          );

          function createEditor()
          StackExchange.prepareEditor(
          heartbeatType: 'answer',
          autoActivateHeartbeat: false,
          convertImagesToLinks: false,
          noModals: true,
          showLowRepImageUploadWarning: true,
          reputationToPostImages: null,
          bindNavPrevention: true,
          postfix: "",
          imageUploader:
          brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
          contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
          allowUrls: true
          ,
          onDemand: true,
          discardSelector: ".discard-answer"
          ,immediatelyShowMarkdownHelp:true
          );



          );













          draft saved

          draft discarded


















          StackExchange.ready(
          function ()
          StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fbioinformatics.stackexchange.com%2fquestions%2f7436%2fwhat-does-fetching-by-region-is-not-available-for-sam-files-mean%23new-answer', 'question_page');

          );

          Post as a guest















          Required, but never shown

























          2 Answers
          2






          active

          oldest

          votes








          2 Answers
          2






          active

          oldest

          votes









          active

          oldest

          votes






          active

          oldest

          votes









          5












          $begingroup$

          Your 3_Tms_1_mapped.bam file, despite its filename extension, is in fact a bgzipped SAM file. You can verify this using htsfile, which is a small utility packaged with HTSlib:



          $ htsfile 3_Tms_1_mapped.bam 
          3_Tms_1_mapped.bam: SAM version 1.3 BGZF-compressed sequence data


          (For files that really are in BAM format, it reports BAM version 1 compressed sequence data.)



          So the error message is accurate in this case.






          share|improve this answer











          $endgroup$












          • $begingroup$
            Hmm, does it mean that this link is wrong? biopython.org/DIST/docs/api/Bio.bgzf-module.html I sort of thought that it's the same thing...
            $endgroup$
            – Kamil S Jaron
            Apr 10 at 21:52






          • 1




            $begingroup$
            No. (Was there a particular part of that you think might be wrong?) But there is a difference between BGZF-compressing the plain text format, and a BAM file (whose decompressed underlying stream is a bespoke binary format that is different from the plain SAM text).
            $endgroup$
            – John Marshall
            Apr 10 at 21:58










          • $begingroup$
            I think I get it now. I got confused by the aims to this module, they write "In addition to being required for random access to and writing of BAM files", so I got an impression that I can simply use this module to write BAM files. This was actually really helpful answer, thanks!
            $endgroup$
            – Kamil S Jaron
            Apr 11 at 7:05







          • 1




            $begingroup$
            @KamilSJaron you might want to consider accepting this answer instead of mine since this was the real issue.
            $endgroup$
            – terdon
            Apr 11 at 8:10















          5












          $begingroup$

          Your 3_Tms_1_mapped.bam file, despite its filename extension, is in fact a bgzipped SAM file. You can verify this using htsfile, which is a small utility packaged with HTSlib:



          $ htsfile 3_Tms_1_mapped.bam 
          3_Tms_1_mapped.bam: SAM version 1.3 BGZF-compressed sequence data


          (For files that really are in BAM format, it reports BAM version 1 compressed sequence data.)



          So the error message is accurate in this case.






          share|improve this answer











          $endgroup$












          • $begingroup$
            Hmm, does it mean that this link is wrong? biopython.org/DIST/docs/api/Bio.bgzf-module.html I sort of thought that it's the same thing...
            $endgroup$
            – Kamil S Jaron
            Apr 10 at 21:52






          • 1




            $begingroup$
            No. (Was there a particular part of that you think might be wrong?) But there is a difference between BGZF-compressing the plain text format, and a BAM file (whose decompressed underlying stream is a bespoke binary format that is different from the plain SAM text).
            $endgroup$
            – John Marshall
            Apr 10 at 21:58










          • $begingroup$
            I think I get it now. I got confused by the aims to this module, they write "In addition to being required for random access to and writing of BAM files", so I got an impression that I can simply use this module to write BAM files. This was actually really helpful answer, thanks!
            $endgroup$
            – Kamil S Jaron
            Apr 11 at 7:05







          • 1




            $begingroup$
            @KamilSJaron you might want to consider accepting this answer instead of mine since this was the real issue.
            $endgroup$
            – terdon
            Apr 11 at 8:10













          5












          5








          5





          $begingroup$

          Your 3_Tms_1_mapped.bam file, despite its filename extension, is in fact a bgzipped SAM file. You can verify this using htsfile, which is a small utility packaged with HTSlib:



          $ htsfile 3_Tms_1_mapped.bam 
          3_Tms_1_mapped.bam: SAM version 1.3 BGZF-compressed sequence data


          (For files that really are in BAM format, it reports BAM version 1 compressed sequence data.)



          So the error message is accurate in this case.






          share|improve this answer











          $endgroup$



          Your 3_Tms_1_mapped.bam file, despite its filename extension, is in fact a bgzipped SAM file. You can verify this using htsfile, which is a small utility packaged with HTSlib:



          $ htsfile 3_Tms_1_mapped.bam 
          3_Tms_1_mapped.bam: SAM version 1.3 BGZF-compressed sequence data


          (For files that really are in BAM format, it reports BAM version 1 compressed sequence data.)



          So the error message is accurate in this case.







          share|improve this answer














          share|improve this answer



          share|improve this answer








          edited Apr 11 at 6:36

























          answered Apr 10 at 20:43









          John MarshallJohn Marshall

          56826




          56826











          • $begingroup$
            Hmm, does it mean that this link is wrong? biopython.org/DIST/docs/api/Bio.bgzf-module.html I sort of thought that it's the same thing...
            $endgroup$
            – Kamil S Jaron
            Apr 10 at 21:52






          • 1




            $begingroup$
            No. (Was there a particular part of that you think might be wrong?) But there is a difference between BGZF-compressing the plain text format, and a BAM file (whose decompressed underlying stream is a bespoke binary format that is different from the plain SAM text).
            $endgroup$
            – John Marshall
            Apr 10 at 21:58










          • $begingroup$
            I think I get it now. I got confused by the aims to this module, they write "In addition to being required for random access to and writing of BAM files", so I got an impression that I can simply use this module to write BAM files. This was actually really helpful answer, thanks!
            $endgroup$
            – Kamil S Jaron
            Apr 11 at 7:05







          • 1




            $begingroup$
            @KamilSJaron you might want to consider accepting this answer instead of mine since this was the real issue.
            $endgroup$
            – terdon
            Apr 11 at 8:10
















          • $begingroup$
            Hmm, does it mean that this link is wrong? biopython.org/DIST/docs/api/Bio.bgzf-module.html I sort of thought that it's the same thing...
            $endgroup$
            – Kamil S Jaron
            Apr 10 at 21:52






          • 1




            $begingroup$
            No. (Was there a particular part of that you think might be wrong?) But there is a difference between BGZF-compressing the plain text format, and a BAM file (whose decompressed underlying stream is a bespoke binary format that is different from the plain SAM text).
            $endgroup$
            – John Marshall
            Apr 10 at 21:58










          • $begingroup$
            I think I get it now. I got confused by the aims to this module, they write "In addition to being required for random access to and writing of BAM files", so I got an impression that I can simply use this module to write BAM files. This was actually really helpful answer, thanks!
            $endgroup$
            – Kamil S Jaron
            Apr 11 at 7:05







          • 1




            $begingroup$
            @KamilSJaron you might want to consider accepting this answer instead of mine since this was the real issue.
            $endgroup$
            – terdon
            Apr 11 at 8:10















          $begingroup$
          Hmm, does it mean that this link is wrong? biopython.org/DIST/docs/api/Bio.bgzf-module.html I sort of thought that it's the same thing...
          $endgroup$
          – Kamil S Jaron
          Apr 10 at 21:52




          $begingroup$
          Hmm, does it mean that this link is wrong? biopython.org/DIST/docs/api/Bio.bgzf-module.html I sort of thought that it's the same thing...
          $endgroup$
          – Kamil S Jaron
          Apr 10 at 21:52




          1




          1




          $begingroup$
          No. (Was there a particular part of that you think might be wrong?) But there is a difference between BGZF-compressing the plain text format, and a BAM file (whose decompressed underlying stream is a bespoke binary format that is different from the plain SAM text).
          $endgroup$
          – John Marshall
          Apr 10 at 21:58




          $begingroup$
          No. (Was there a particular part of that you think might be wrong?) But there is a difference between BGZF-compressing the plain text format, and a BAM file (whose decompressed underlying stream is a bespoke binary format that is different from the plain SAM text).
          $endgroup$
          – John Marshall
          Apr 10 at 21:58












          $begingroup$
          I think I get it now. I got confused by the aims to this module, they write "In addition to being required for random access to and writing of BAM files", so I got an impression that I can simply use this module to write BAM files. This was actually really helpful answer, thanks!
          $endgroup$
          – Kamil S Jaron
          Apr 11 at 7:05





          $begingroup$
          I think I get it now. I got confused by the aims to this module, they write "In addition to being required for random access to and writing of BAM files", so I got an impression that I can simply use this module to write BAM files. This was actually really helpful answer, thanks!
          $endgroup$
          – Kamil S Jaron
          Apr 11 at 7:05





          1




          1




          $begingroup$
          @KamilSJaron you might want to consider accepting this answer instead of mine since this was the real issue.
          $endgroup$
          – terdon
          Apr 11 at 8:10




          $begingroup$
          @KamilSJaron you might want to consider accepting this answer instead of mine since this was the real issue.
          $endgroup$
          – terdon
          Apr 11 at 8:10











          4












          $begingroup$

          That isn't actually a bam file as John Marshall figured out. I am keeping the rest of my answer since it could be useful to someone else, but the issue here was that you had a compressed (bgzipped) sam file and not an actual bam file and that's why you were getting that error. When I sorted your file in preparation for indexing it, I converted to a bam which is why the rest of my answer worked.




          You don't have the index file for the bam file you're using. I used this script on the file you linked to (changing the name so that it corresponds to the right file and a contig in that file):



          #!/usr/bin/env python3

          import pysam
          samfile = pysam.AlignmentFile("3_Tce_1_mapped.bam", "rb")
          for read in samfile.fetch('3_Tce_b3v08_scaf005149'):
          print(read)
          samfile.close()


          The directory I ran it in had:



          $ ls 3*
          3_Tce_1_mapped.bam


          And I got the error you described:



          $ foo.py
          Traceback (most recent call last):
          File "/home/terdon/scripts/foo.py", line 5, in <module>
          for read in samfile.fetch('3_Tce_b3v08_scaf005149'):
          File "pysam/libcalignmentfile.pyx", line 1107, in pysam.libcalignmentfile.AlignmentFile.fetch
          ValueError: fetching by region is not available for SAM files


          However, indexing the bam file fixed it:



          $ samtools sort 3_Tce_1_mapped.bam > 3_Tce_1_mapped.sorted.bam 
          $
          mv 3_Tce_1_mapped.sorted.bam 3_Tce_1_mapped.bam
          $ samtools index 3_Tce_1_mapped.bam
          $
          ls 3*
          3_Tce_1_mapped.bam 3_Tce_1_mapped.bam.bai

          $ foo.py | wc
          227 2724 16725


          So just sort and index your files before attempting to seek in them. Which makes sense since the index's job is primarily to allow seeking.






          share|improve this answer











          $endgroup$












          • $begingroup$
            I was thinking about indexing, but then I thought that if that would be the problem pysam would complain about sorting/indexing, not about file being sam. Do yo think it would be worth opening them a ticket to make the error message bit more meaningful?
            $endgroup$
            – Kamil S Jaron
            Apr 10 at 18:37










          • $begingroup$
            Seeking, i.e. random access, requires an index. Note that this is not the only access pattern in pysam. You can access reads in a streaming fashion (for read in samfile:) without sorting, and you can access read pileups base-by-base (for column in samfile.pileup('chr1')), which does require an index if I understand correctly.
            $endgroup$
            – Daniel Standage
            Apr 10 at 18:40










          • $begingroup$
            @KamilSJaron probably, yes. It obviously caused you some pain, and I'm sure you won't be alone. I only thought of this because I have seen similar cryptic errors and was wondering how it could seek without an index anyway.
            $endgroup$
            – terdon
            Apr 10 at 18:40






          • 1




            $begingroup$
            @KamilSJaron Yes, I think it's an unhelpful error message, and it has caused me confusion in the past as well. If you open a ticket on Github I'd be happy to "pile on." :-)
            $endgroup$
            – Daniel Standage
            Apr 10 at 18:41










          • $begingroup$
            @DanielStandage I am this close to suspending you for that horrible pun! I would too if I could stop chuckling for long enough!
            $endgroup$
            – terdon
            Apr 10 at 18:42















          4












          $begingroup$

          That isn't actually a bam file as John Marshall figured out. I am keeping the rest of my answer since it could be useful to someone else, but the issue here was that you had a compressed (bgzipped) sam file and not an actual bam file and that's why you were getting that error. When I sorted your file in preparation for indexing it, I converted to a bam which is why the rest of my answer worked.




          You don't have the index file for the bam file you're using. I used this script on the file you linked to (changing the name so that it corresponds to the right file and a contig in that file):



          #!/usr/bin/env python3

          import pysam
          samfile = pysam.AlignmentFile("3_Tce_1_mapped.bam", "rb")
          for read in samfile.fetch('3_Tce_b3v08_scaf005149'):
          print(read)
          samfile.close()


          The directory I ran it in had:



          $ ls 3*
          3_Tce_1_mapped.bam


          And I got the error you described:



          $ foo.py
          Traceback (most recent call last):
          File "/home/terdon/scripts/foo.py", line 5, in <module>
          for read in samfile.fetch('3_Tce_b3v08_scaf005149'):
          File "pysam/libcalignmentfile.pyx", line 1107, in pysam.libcalignmentfile.AlignmentFile.fetch
          ValueError: fetching by region is not available for SAM files


          However, indexing the bam file fixed it:



          $ samtools sort 3_Tce_1_mapped.bam > 3_Tce_1_mapped.sorted.bam 
          $
          mv 3_Tce_1_mapped.sorted.bam 3_Tce_1_mapped.bam
          $ samtools index 3_Tce_1_mapped.bam
          $
          ls 3*
          3_Tce_1_mapped.bam 3_Tce_1_mapped.bam.bai

          $ foo.py | wc
          227 2724 16725


          So just sort and index your files before attempting to seek in them. Which makes sense since the index's job is primarily to allow seeking.






          share|improve this answer











          $endgroup$












          • $begingroup$
            I was thinking about indexing, but then I thought that if that would be the problem pysam would complain about sorting/indexing, not about file being sam. Do yo think it would be worth opening them a ticket to make the error message bit more meaningful?
            $endgroup$
            – Kamil S Jaron
            Apr 10 at 18:37










          • $begingroup$
            Seeking, i.e. random access, requires an index. Note that this is not the only access pattern in pysam. You can access reads in a streaming fashion (for read in samfile:) without sorting, and you can access read pileups base-by-base (for column in samfile.pileup('chr1')), which does require an index if I understand correctly.
            $endgroup$
            – Daniel Standage
            Apr 10 at 18:40










          • $begingroup$
            @KamilSJaron probably, yes. It obviously caused you some pain, and I'm sure you won't be alone. I only thought of this because I have seen similar cryptic errors and was wondering how it could seek without an index anyway.
            $endgroup$
            – terdon
            Apr 10 at 18:40






          • 1




            $begingroup$
            @KamilSJaron Yes, I think it's an unhelpful error message, and it has caused me confusion in the past as well. If you open a ticket on Github I'd be happy to "pile on." :-)
            $endgroup$
            – Daniel Standage
            Apr 10 at 18:41










          • $begingroup$
            @DanielStandage I am this close to suspending you for that horrible pun! I would too if I could stop chuckling for long enough!
            $endgroup$
            – terdon
            Apr 10 at 18:42













          4












          4








          4





          $begingroup$

          That isn't actually a bam file as John Marshall figured out. I am keeping the rest of my answer since it could be useful to someone else, but the issue here was that you had a compressed (bgzipped) sam file and not an actual bam file and that's why you were getting that error. When I sorted your file in preparation for indexing it, I converted to a bam which is why the rest of my answer worked.




          You don't have the index file for the bam file you're using. I used this script on the file you linked to (changing the name so that it corresponds to the right file and a contig in that file):



          #!/usr/bin/env python3

          import pysam
          samfile = pysam.AlignmentFile("3_Tce_1_mapped.bam", "rb")
          for read in samfile.fetch('3_Tce_b3v08_scaf005149'):
          print(read)
          samfile.close()


          The directory I ran it in had:



          $ ls 3*
          3_Tce_1_mapped.bam


          And I got the error you described:



          $ foo.py
          Traceback (most recent call last):
          File "/home/terdon/scripts/foo.py", line 5, in <module>
          for read in samfile.fetch('3_Tce_b3v08_scaf005149'):
          File "pysam/libcalignmentfile.pyx", line 1107, in pysam.libcalignmentfile.AlignmentFile.fetch
          ValueError: fetching by region is not available for SAM files


          However, indexing the bam file fixed it:



          $ samtools sort 3_Tce_1_mapped.bam > 3_Tce_1_mapped.sorted.bam 
          $
          mv 3_Tce_1_mapped.sorted.bam 3_Tce_1_mapped.bam
          $ samtools index 3_Tce_1_mapped.bam
          $
          ls 3*
          3_Tce_1_mapped.bam 3_Tce_1_mapped.bam.bai

          $ foo.py | wc
          227 2724 16725


          So just sort and index your files before attempting to seek in them. Which makes sense since the index's job is primarily to allow seeking.






          share|improve this answer











          $endgroup$



          That isn't actually a bam file as John Marshall figured out. I am keeping the rest of my answer since it could be useful to someone else, but the issue here was that you had a compressed (bgzipped) sam file and not an actual bam file and that's why you were getting that error. When I sorted your file in preparation for indexing it, I converted to a bam which is why the rest of my answer worked.




          You don't have the index file for the bam file you're using. I used this script on the file you linked to (changing the name so that it corresponds to the right file and a contig in that file):



          #!/usr/bin/env python3

          import pysam
          samfile = pysam.AlignmentFile("3_Tce_1_mapped.bam", "rb")
          for read in samfile.fetch('3_Tce_b3v08_scaf005149'):
          print(read)
          samfile.close()


          The directory I ran it in had:



          $ ls 3*
          3_Tce_1_mapped.bam


          And I got the error you described:



          $ foo.py
          Traceback (most recent call last):
          File "/home/terdon/scripts/foo.py", line 5, in <module>
          for read in samfile.fetch('3_Tce_b3v08_scaf005149'):
          File "pysam/libcalignmentfile.pyx", line 1107, in pysam.libcalignmentfile.AlignmentFile.fetch
          ValueError: fetching by region is not available for SAM files


          However, indexing the bam file fixed it:



          $ samtools sort 3_Tce_1_mapped.bam > 3_Tce_1_mapped.sorted.bam 
          $
          mv 3_Tce_1_mapped.sorted.bam 3_Tce_1_mapped.bam
          $ samtools index 3_Tce_1_mapped.bam
          $
          ls 3*
          3_Tce_1_mapped.bam 3_Tce_1_mapped.bam.bai

          $ foo.py | wc
          227 2724 16725


          So just sort and index your files before attempting to seek in them. Which makes sense since the index's job is primarily to allow seeking.







          share|improve this answer














          share|improve this answer



          share|improve this answer








          edited Apr 11 at 9:54

























          answered Apr 10 at 16:10









          terdonterdon

          4,7752830




          4,7752830











          • $begingroup$
            I was thinking about indexing, but then I thought that if that would be the problem pysam would complain about sorting/indexing, not about file being sam. Do yo think it would be worth opening them a ticket to make the error message bit more meaningful?
            $endgroup$
            – Kamil S Jaron
            Apr 10 at 18:37










          • $begingroup$
            Seeking, i.e. random access, requires an index. Note that this is not the only access pattern in pysam. You can access reads in a streaming fashion (for read in samfile:) without sorting, and you can access read pileups base-by-base (for column in samfile.pileup('chr1')), which does require an index if I understand correctly.
            $endgroup$
            – Daniel Standage
            Apr 10 at 18:40










          • $begingroup$
            @KamilSJaron probably, yes. It obviously caused you some pain, and I'm sure you won't be alone. I only thought of this because I have seen similar cryptic errors and was wondering how it could seek without an index anyway.
            $endgroup$
            – terdon
            Apr 10 at 18:40






          • 1




            $begingroup$
            @KamilSJaron Yes, I think it's an unhelpful error message, and it has caused me confusion in the past as well. If you open a ticket on Github I'd be happy to "pile on." :-)
            $endgroup$
            – Daniel Standage
            Apr 10 at 18:41










          • $begingroup$
            @DanielStandage I am this close to suspending you for that horrible pun! I would too if I could stop chuckling for long enough!
            $endgroup$
            – terdon
            Apr 10 at 18:42
















          • $begingroup$
            I was thinking about indexing, but then I thought that if that would be the problem pysam would complain about sorting/indexing, not about file being sam. Do yo think it would be worth opening them a ticket to make the error message bit more meaningful?
            $endgroup$
            – Kamil S Jaron
            Apr 10 at 18:37










          • $begingroup$
            Seeking, i.e. random access, requires an index. Note that this is not the only access pattern in pysam. You can access reads in a streaming fashion (for read in samfile:) without sorting, and you can access read pileups base-by-base (for column in samfile.pileup('chr1')), which does require an index if I understand correctly.
            $endgroup$
            – Daniel Standage
            Apr 10 at 18:40










          • $begingroup$
            @KamilSJaron probably, yes. It obviously caused you some pain, and I'm sure you won't be alone. I only thought of this because I have seen similar cryptic errors and was wondering how it could seek without an index anyway.
            $endgroup$
            – terdon
            Apr 10 at 18:40






          • 1




            $begingroup$
            @KamilSJaron Yes, I think it's an unhelpful error message, and it has caused me confusion in the past as well. If you open a ticket on Github I'd be happy to "pile on." :-)
            $endgroup$
            – Daniel Standage
            Apr 10 at 18:41










          • $begingroup$
            @DanielStandage I am this close to suspending you for that horrible pun! I would too if I could stop chuckling for long enough!
            $endgroup$
            – terdon
            Apr 10 at 18:42















          $begingroup$
          I was thinking about indexing, but then I thought that if that would be the problem pysam would complain about sorting/indexing, not about file being sam. Do yo think it would be worth opening them a ticket to make the error message bit more meaningful?
          $endgroup$
          – Kamil S Jaron
          Apr 10 at 18:37




          $begingroup$
          I was thinking about indexing, but then I thought that if that would be the problem pysam would complain about sorting/indexing, not about file being sam. Do yo think it would be worth opening them a ticket to make the error message bit more meaningful?
          $endgroup$
          – Kamil S Jaron
          Apr 10 at 18:37












          $begingroup$
          Seeking, i.e. random access, requires an index. Note that this is not the only access pattern in pysam. You can access reads in a streaming fashion (for read in samfile:) without sorting, and you can access read pileups base-by-base (for column in samfile.pileup('chr1')), which does require an index if I understand correctly.
          $endgroup$
          – Daniel Standage
          Apr 10 at 18:40




          $begingroup$
          Seeking, i.e. random access, requires an index. Note that this is not the only access pattern in pysam. You can access reads in a streaming fashion (for read in samfile:) without sorting, and you can access read pileups base-by-base (for column in samfile.pileup('chr1')), which does require an index if I understand correctly.
          $endgroup$
          – Daniel Standage
          Apr 10 at 18:40












          $begingroup$
          @KamilSJaron probably, yes. It obviously caused you some pain, and I'm sure you won't be alone. I only thought of this because I have seen similar cryptic errors and was wondering how it could seek without an index anyway.
          $endgroup$
          – terdon
          Apr 10 at 18:40




          $begingroup$
          @KamilSJaron probably, yes. It obviously caused you some pain, and I'm sure you won't be alone. I only thought of this because I have seen similar cryptic errors and was wondering how it could seek without an index anyway.
          $endgroup$
          – terdon
          Apr 10 at 18:40




          1




          1




          $begingroup$
          @KamilSJaron Yes, I think it's an unhelpful error message, and it has caused me confusion in the past as well. If you open a ticket on Github I'd be happy to "pile on." :-)
          $endgroup$
          – Daniel Standage
          Apr 10 at 18:41




          $begingroup$
          @KamilSJaron Yes, I think it's an unhelpful error message, and it has caused me confusion in the past as well. If you open a ticket on Github I'd be happy to "pile on." :-)
          $endgroup$
          – Daniel Standage
          Apr 10 at 18:41












          $begingroup$
          @DanielStandage I am this close to suspending you for that horrible pun! I would too if I could stop chuckling for long enough!
          $endgroup$
          – terdon
          Apr 10 at 18:42




          $begingroup$
          @DanielStandage I am this close to suspending you for that horrible pun! I would too if I could stop chuckling for long enough!
          $endgroup$
          – terdon
          Apr 10 at 18:42

















          draft saved

          draft discarded
















































          Thanks for contributing an answer to Bioinformatics Stack Exchange!


          • Please be sure to answer the question. Provide details and share your research!

          But avoid


          • Asking for help, clarification, or responding to other answers.

          • Making statements based on opinion; back them up with references or personal experience.

          Use MathJax to format equations. MathJax reference.


          To learn more, see our tips on writing great answers.




          draft saved


          draft discarded














          StackExchange.ready(
          function ()
          StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fbioinformatics.stackexchange.com%2fquestions%2f7436%2fwhat-does-fetching-by-region-is-not-available-for-sam-files-mean%23new-answer', 'question_page');

          );

          Post as a guest















          Required, but never shown





















































          Required, but never shown














          Required, but never shown












          Required, but never shown







          Required, but never shown

































          Required, but never shown














          Required, but never shown












          Required, but never shown







          Required, but never shown







          Popular posts from this blog

          Bulk add to cart function issuecart vs. mini cart issue … rwd themeRedirect Add to cart button to cart pageAdd to cart issue - Magento 2.1The requested Payment Method is not available When creating an orderM2: reason add-to-cart might not function in production modeAdd to cart issue in some android devicesMagento 2 - custom price can not add to subtotal and grand total after add to cartAdd to cart codeIssue with my cart module on pdp and cart pages, just keeps spinningBulk price and quantity update using rest api

          БиармияSxpst500bh2ntaf! 3h2r