What does “fetching by region is not available for SAM files” mean? Planned maintenance scheduled April 17/18, 2019 at 00:00UTC (8:00pm US/Eastern) Announcing the arrival of Valued Associate #679: Cesar Manara Unicorn Meta Zoo #1: Why another podcast?How to safely and efficiently convert subset of bam to fastq?How to you find read statistics for bam files with terminal mismatch?Pepsin digest (cleavage) does not work using RE?running a samtools command for multiple bam files from 1000 genomes projectSAM format: Does the BAM “Integer or numeric array” field no longer exist? Why?python does not quit when the input file is too bigconvert supplementary reads to primary in sam or bamI have 23andme text files and would like to convert to SAM/BAM formatRunning htseq-count over BAM filesAccess base aligned to particular reference position
Estimate capacitor parameters
Why is "Captain Marvel" translated as male in Portugal?
Estimated State payment too big --> money back; + 2018 Tax Reform
Fishing simulator
What's the difference between (size_t)-1 and ~0?
If I can make up priors, why can't I make up posteriors?
Am I ethically obligated to go into work on an off day if the reason is sudden?
How do you clear the ApexPages.getMessages() collection in a test?
How to pour concrete for curved walkway to prevent cracking?
Keep going mode for require-package
How to rotate it perfectly?
How did passengers keep warm on sail ships?
How to colour the US map with Yellow, Green, Red and Blue to minimize the number of states with the colour of Green
Can a 1st-level character have an ability score above 18?
What kind of display is this?
How to politely respond to generic emails requesting a PhD/job in my lab? Without wasting too much time
Can the prologue be the backstory of your main character?
New Order #5: where Fibonacci and Beatty meet at Wythoff
Cold is to Refrigerator as warm is to?
Need a suitable toxic chemical for a murder plot in my novel
How to market an anarchic city as a tourism spot to people living in civilized areas?
How can I protect witches in combat who wear limited clothing?
Why use gamma over alpha radiation?
Who can trigger ship-wide alerts in Star Trek?
What does “fetching by region is not available for SAM files” mean?
Planned maintenance scheduled April 17/18, 2019 at 00:00UTC (8:00pm US/Eastern)
Announcing the arrival of Valued Associate #679: Cesar Manara
Unicorn Meta Zoo #1: Why another podcast?How to safely and efficiently convert subset of bam to fastq?How to you find read statistics for bam files with terminal mismatch?Pepsin digest (cleavage) does not work using RE?running a samtools command for multiple bam files from 1000 genomes projectSAM format: Does the BAM “Integer or numeric array” field no longer exist? Why?python does not quit when the input file is too bigconvert supplementary reads to primary in sam or bamI have 23andme text files and would like to convert to SAM/BAM formatRunning htseq-count over BAM filesAccess base aligned to particular reference position
$begingroup$
I am used to gzip/biopython solutions when dealing with sequencing data, but now I wish to switch to more elegant pysam. So I looked at the manual, but ran into quite bizarre troubles with the first couple of lines using my bam file
import pysam
samfile = pysam.AlignmentFile("3_Tms_1_mapped.bam", "rb")
for read in samfile.fetch('3_Tms_b3v08_scaf000159'):
print(read)
samfile.close()
returns ValueError: fetching by region is not available for SAM files. Well, the file is bam. I tried to google the error but the only hits I found are the lines in the source code of pysam that check if the file is bam/cram or sam, so somehow pysam thinks that my bam is a sam. How can I convince it otherwise? I have also noticed that the manual is for python 2.7, that's maybe where the problem comes from...
If you wish to reproduce my problem, here is my bam file.
python bam pysam
$endgroup$
add a comment |
$begingroup$
I am used to gzip/biopython solutions when dealing with sequencing data, but now I wish to switch to more elegant pysam. So I looked at the manual, but ran into quite bizarre troubles with the first couple of lines using my bam file
import pysam
samfile = pysam.AlignmentFile("3_Tms_1_mapped.bam", "rb")
for read in samfile.fetch('3_Tms_b3v08_scaf000159'):
print(read)
samfile.close()
returns ValueError: fetching by region is not available for SAM files. Well, the file is bam. I tried to google the error but the only hits I found are the lines in the source code of pysam that check if the file is bam/cram or sam, so somehow pysam thinks that my bam is a sam. How can I convince it otherwise? I have also noticed that the manual is for python 2.7, that's maybe where the problem comes from...
If you wish to reproduce my problem, here is my bam file.
python bam pysam
$endgroup$
add a comment |
$begingroup$
I am used to gzip/biopython solutions when dealing with sequencing data, but now I wish to switch to more elegant pysam. So I looked at the manual, but ran into quite bizarre troubles with the first couple of lines using my bam file
import pysam
samfile = pysam.AlignmentFile("3_Tms_1_mapped.bam", "rb")
for read in samfile.fetch('3_Tms_b3v08_scaf000159'):
print(read)
samfile.close()
returns ValueError: fetching by region is not available for SAM files. Well, the file is bam. I tried to google the error but the only hits I found are the lines in the source code of pysam that check if the file is bam/cram or sam, so somehow pysam thinks that my bam is a sam. How can I convince it otherwise? I have also noticed that the manual is for python 2.7, that's maybe where the problem comes from...
If you wish to reproduce my problem, here is my bam file.
python bam pysam
$endgroup$
I am used to gzip/biopython solutions when dealing with sequencing data, but now I wish to switch to more elegant pysam. So I looked at the manual, but ran into quite bizarre troubles with the first couple of lines using my bam file
import pysam
samfile = pysam.AlignmentFile("3_Tms_1_mapped.bam", "rb")
for read in samfile.fetch('3_Tms_b3v08_scaf000159'):
print(read)
samfile.close()
returns ValueError: fetching by region is not available for SAM files. Well, the file is bam. I tried to google the error but the only hits I found are the lines in the source code of pysam that check if the file is bam/cram or sam, so somehow pysam thinks that my bam is a sam. How can I convince it otherwise? I have also noticed that the manual is for python 2.7, that's maybe where the problem comes from...
If you wish to reproduce my problem, here is my bam file.
python bam pysam
python bam pysam
edited Apr 11 at 10:17
terdon♦
4,7752830
4,7752830
asked Apr 10 at 15:51
Kamil S JaronKamil S Jaron
2,942742
2,942742
add a comment |
add a comment |
2 Answers
2
active
oldest
votes
$begingroup$
Your 3_Tms_1_mapped.bam file, despite its filename extension, is in fact a bgzipped SAM file. You can verify this using htsfile, which is a small utility packaged with HTSlib:
$ htsfile 3_Tms_1_mapped.bam
3_Tms_1_mapped.bam: SAM version 1.3 BGZF-compressed sequence data
(For files that really are in BAM format, it reports BAM version 1 compressed sequence data.)
So the error message is accurate in this case.
$endgroup$
$begingroup$
Hmm, does it mean that this link is wrong? biopython.org/DIST/docs/api/Bio.bgzf-module.html I sort of thought that it's the same thing...
$endgroup$
– Kamil S Jaron
Apr 10 at 21:52
1
$begingroup$
No. (Was there a particular part of that you think might be wrong?) But there is a difference between BGZF-compressing the plain text format, and a BAM file (whose decompressed underlying stream is a bespoke binary format that is different from the plain SAM text).
$endgroup$
– John Marshall
Apr 10 at 21:58
$begingroup$
I think I get it now. I got confused by the aims to this module, they write "In addition to being required for random access to and writing of BAM files", so I got an impression that I can simply use this module to write BAM files. This was actually really helpful answer, thanks!
$endgroup$
– Kamil S Jaron
Apr 11 at 7:05
1
$begingroup$
@KamilSJaron you might want to consider accepting this answer instead of mine since this was the real issue.
$endgroup$
– terdon♦
Apr 11 at 8:10
add a comment |
$begingroup$
That isn't actually a bam file as John Marshall figured out. I am keeping the rest of my answer since it could be useful to someone else, but the issue here was that you had a compressed (bgzipped) sam file and not an actual bam file and that's why you were getting that error. When I sorted your file in preparation for indexing it, I converted to a bam which is why the rest of my answer worked.
You don't have the index file for the bam file you're using. I used this script on the file you linked to (changing the name so that it corresponds to the right file and a contig in that file):
#!/usr/bin/env python3
import pysam
samfile = pysam.AlignmentFile("3_Tce_1_mapped.bam", "rb")
for read in samfile.fetch('3_Tce_b3v08_scaf005149'):
print(read)
samfile.close()
The directory I ran it in had:
$ ls 3*
3_Tce_1_mapped.bam
And I got the error you described:
$ foo.py
Traceback (most recent call last):
File "/home/terdon/scripts/foo.py", line 5, in <module>
for read in samfile.fetch('3_Tce_b3v08_scaf005149'):
File "pysam/libcalignmentfile.pyx", line 1107, in pysam.libcalignmentfile.AlignmentFile.fetch
ValueError: fetching by region is not available for SAM files
However, indexing the bam file fixed it:
$ samtools sort 3_Tce_1_mapped.bam > 3_Tce_1_mapped.sorted.bam
$ mv 3_Tce_1_mapped.sorted.bam 3_Tce_1_mapped.bam
$ samtools index 3_Tce_1_mapped.bam
$ ls 3*
3_Tce_1_mapped.bam 3_Tce_1_mapped.bam.bai
$ foo.py | wc
227 2724 16725
So just sort and index your files before attempting to seek in them. Which makes sense since the index's job is primarily to allow seeking.
$endgroup$
$begingroup$
I was thinking about indexing, but then I thought that if that would be the problem pysam would complain about sorting/indexing, not about file being sam. Do yo think it would be worth opening them a ticket to make the error message bit more meaningful?
$endgroup$
– Kamil S Jaron
Apr 10 at 18:37
$begingroup$
Seeking, i.e. random access, requires an index. Note that this is not the only access pattern in pysam. You can access reads in a streaming fashion (for read in samfile:) without sorting, and you can access read pileups base-by-base (for column in samfile.pileup('chr1')), which does require an index if I understand correctly.
$endgroup$
– Daniel Standage
Apr 10 at 18:40
$begingroup$
@KamilSJaron probably, yes. It obviously caused you some pain, and I'm sure you won't be alone. I only thought of this because I have seen similar cryptic errors and was wondering how it could seek without an index anyway.
$endgroup$
– terdon♦
Apr 10 at 18:40
1
$begingroup$
@KamilSJaron Yes, I think it's an unhelpful error message, and it has caused me confusion in the past as well. If you open a ticket on Github I'd be happy to "pile on." :-)
$endgroup$
– Daniel Standage
Apr 10 at 18:41
$begingroup$
@DanielStandage I am this close to suspending you for that horrible pun! I would too if I could stop chuckling for long enough!
$endgroup$
– terdon♦
Apr 10 at 18:42
|
show 4 more comments
Your Answer
StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "676"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);
else
createEditor();
);
function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);
);
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fbioinformatics.stackexchange.com%2fquestions%2f7436%2fwhat-does-fetching-by-region-is-not-available-for-sam-files-mean%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
2 Answers
2
active
oldest
votes
2 Answers
2
active
oldest
votes
active
oldest
votes
active
oldest
votes
$begingroup$
Your 3_Tms_1_mapped.bam file, despite its filename extension, is in fact a bgzipped SAM file. You can verify this using htsfile, which is a small utility packaged with HTSlib:
$ htsfile 3_Tms_1_mapped.bam
3_Tms_1_mapped.bam: SAM version 1.3 BGZF-compressed sequence data
(For files that really are in BAM format, it reports BAM version 1 compressed sequence data.)
So the error message is accurate in this case.
$endgroup$
$begingroup$
Hmm, does it mean that this link is wrong? biopython.org/DIST/docs/api/Bio.bgzf-module.html I sort of thought that it's the same thing...
$endgroup$
– Kamil S Jaron
Apr 10 at 21:52
1
$begingroup$
No. (Was there a particular part of that you think might be wrong?) But there is a difference between BGZF-compressing the plain text format, and a BAM file (whose decompressed underlying stream is a bespoke binary format that is different from the plain SAM text).
$endgroup$
– John Marshall
Apr 10 at 21:58
$begingroup$
I think I get it now. I got confused by the aims to this module, they write "In addition to being required for random access to and writing of BAM files", so I got an impression that I can simply use this module to write BAM files. This was actually really helpful answer, thanks!
$endgroup$
– Kamil S Jaron
Apr 11 at 7:05
1
$begingroup$
@KamilSJaron you might want to consider accepting this answer instead of mine since this was the real issue.
$endgroup$
– terdon♦
Apr 11 at 8:10
add a comment |
$begingroup$
Your 3_Tms_1_mapped.bam file, despite its filename extension, is in fact a bgzipped SAM file. You can verify this using htsfile, which is a small utility packaged with HTSlib:
$ htsfile 3_Tms_1_mapped.bam
3_Tms_1_mapped.bam: SAM version 1.3 BGZF-compressed sequence data
(For files that really are in BAM format, it reports BAM version 1 compressed sequence data.)
So the error message is accurate in this case.
$endgroup$
$begingroup$
Hmm, does it mean that this link is wrong? biopython.org/DIST/docs/api/Bio.bgzf-module.html I sort of thought that it's the same thing...
$endgroup$
– Kamil S Jaron
Apr 10 at 21:52
1
$begingroup$
No. (Was there a particular part of that you think might be wrong?) But there is a difference between BGZF-compressing the plain text format, and a BAM file (whose decompressed underlying stream is a bespoke binary format that is different from the plain SAM text).
$endgroup$
– John Marshall
Apr 10 at 21:58
$begingroup$
I think I get it now. I got confused by the aims to this module, they write "In addition to being required for random access to and writing of BAM files", so I got an impression that I can simply use this module to write BAM files. This was actually really helpful answer, thanks!
$endgroup$
– Kamil S Jaron
Apr 11 at 7:05
1
$begingroup$
@KamilSJaron you might want to consider accepting this answer instead of mine since this was the real issue.
$endgroup$
– terdon♦
Apr 11 at 8:10
add a comment |
$begingroup$
Your 3_Tms_1_mapped.bam file, despite its filename extension, is in fact a bgzipped SAM file. You can verify this using htsfile, which is a small utility packaged with HTSlib:
$ htsfile 3_Tms_1_mapped.bam
3_Tms_1_mapped.bam: SAM version 1.3 BGZF-compressed sequence data
(For files that really are in BAM format, it reports BAM version 1 compressed sequence data.)
So the error message is accurate in this case.
$endgroup$
Your 3_Tms_1_mapped.bam file, despite its filename extension, is in fact a bgzipped SAM file. You can verify this using htsfile, which is a small utility packaged with HTSlib:
$ htsfile 3_Tms_1_mapped.bam
3_Tms_1_mapped.bam: SAM version 1.3 BGZF-compressed sequence data
(For files that really are in BAM format, it reports BAM version 1 compressed sequence data.)
So the error message is accurate in this case.
edited Apr 11 at 6:36
answered Apr 10 at 20:43
John MarshallJohn Marshall
56826
56826
$begingroup$
Hmm, does it mean that this link is wrong? biopython.org/DIST/docs/api/Bio.bgzf-module.html I sort of thought that it's the same thing...
$endgroup$
– Kamil S Jaron
Apr 10 at 21:52
1
$begingroup$
No. (Was there a particular part of that you think might be wrong?) But there is a difference between BGZF-compressing the plain text format, and a BAM file (whose decompressed underlying stream is a bespoke binary format that is different from the plain SAM text).
$endgroup$
– John Marshall
Apr 10 at 21:58
$begingroup$
I think I get it now. I got confused by the aims to this module, they write "In addition to being required for random access to and writing of BAM files", so I got an impression that I can simply use this module to write BAM files. This was actually really helpful answer, thanks!
$endgroup$
– Kamil S Jaron
Apr 11 at 7:05
1
$begingroup$
@KamilSJaron you might want to consider accepting this answer instead of mine since this was the real issue.
$endgroup$
– terdon♦
Apr 11 at 8:10
add a comment |
$begingroup$
Hmm, does it mean that this link is wrong? biopython.org/DIST/docs/api/Bio.bgzf-module.html I sort of thought that it's the same thing...
$endgroup$
– Kamil S Jaron
Apr 10 at 21:52
1
$begingroup$
No. (Was there a particular part of that you think might be wrong?) But there is a difference between BGZF-compressing the plain text format, and a BAM file (whose decompressed underlying stream is a bespoke binary format that is different from the plain SAM text).
$endgroup$
– John Marshall
Apr 10 at 21:58
$begingroup$
I think I get it now. I got confused by the aims to this module, they write "In addition to being required for random access to and writing of BAM files", so I got an impression that I can simply use this module to write BAM files. This was actually really helpful answer, thanks!
$endgroup$
– Kamil S Jaron
Apr 11 at 7:05
1
$begingroup$
@KamilSJaron you might want to consider accepting this answer instead of mine since this was the real issue.
$endgroup$
– terdon♦
Apr 11 at 8:10
$begingroup$
Hmm, does it mean that this link is wrong? biopython.org/DIST/docs/api/Bio.bgzf-module.html I sort of thought that it's the same thing...
$endgroup$
– Kamil S Jaron
Apr 10 at 21:52
$begingroup$
Hmm, does it mean that this link is wrong? biopython.org/DIST/docs/api/Bio.bgzf-module.html I sort of thought that it's the same thing...
$endgroup$
– Kamil S Jaron
Apr 10 at 21:52
1
1
$begingroup$
No. (Was there a particular part of that you think might be wrong?) But there is a difference between BGZF-compressing the plain text format, and a BAM file (whose decompressed underlying stream is a bespoke binary format that is different from the plain SAM text).
$endgroup$
– John Marshall
Apr 10 at 21:58
$begingroup$
No. (Was there a particular part of that you think might be wrong?) But there is a difference between BGZF-compressing the plain text format, and a BAM file (whose decompressed underlying stream is a bespoke binary format that is different from the plain SAM text).
$endgroup$
– John Marshall
Apr 10 at 21:58
$begingroup$
I think I get it now. I got confused by the aims to this module, they write "In addition to being required for random access to and writing of BAM files", so I got an impression that I can simply use this module to write BAM files. This was actually really helpful answer, thanks!
$endgroup$
– Kamil S Jaron
Apr 11 at 7:05
$begingroup$
I think I get it now. I got confused by the aims to this module, they write "In addition to being required for random access to and writing of BAM files", so I got an impression that I can simply use this module to write BAM files. This was actually really helpful answer, thanks!
$endgroup$
– Kamil S Jaron
Apr 11 at 7:05
1
1
$begingroup$
@KamilSJaron you might want to consider accepting this answer instead of mine since this was the real issue.
$endgroup$
– terdon♦
Apr 11 at 8:10
$begingroup$
@KamilSJaron you might want to consider accepting this answer instead of mine since this was the real issue.
$endgroup$
– terdon♦
Apr 11 at 8:10
add a comment |
$begingroup$
That isn't actually a bam file as John Marshall figured out. I am keeping the rest of my answer since it could be useful to someone else, but the issue here was that you had a compressed (bgzipped) sam file and not an actual bam file and that's why you were getting that error. When I sorted your file in preparation for indexing it, I converted to a bam which is why the rest of my answer worked.
You don't have the index file for the bam file you're using. I used this script on the file you linked to (changing the name so that it corresponds to the right file and a contig in that file):
#!/usr/bin/env python3
import pysam
samfile = pysam.AlignmentFile("3_Tce_1_mapped.bam", "rb")
for read in samfile.fetch('3_Tce_b3v08_scaf005149'):
print(read)
samfile.close()
The directory I ran it in had:
$ ls 3*
3_Tce_1_mapped.bam
And I got the error you described:
$ foo.py
Traceback (most recent call last):
File "/home/terdon/scripts/foo.py", line 5, in <module>
for read in samfile.fetch('3_Tce_b3v08_scaf005149'):
File "pysam/libcalignmentfile.pyx", line 1107, in pysam.libcalignmentfile.AlignmentFile.fetch
ValueError: fetching by region is not available for SAM files
However, indexing the bam file fixed it:
$ samtools sort 3_Tce_1_mapped.bam > 3_Tce_1_mapped.sorted.bam
$ mv 3_Tce_1_mapped.sorted.bam 3_Tce_1_mapped.bam
$ samtools index 3_Tce_1_mapped.bam
$ ls 3*
3_Tce_1_mapped.bam 3_Tce_1_mapped.bam.bai
$ foo.py | wc
227 2724 16725
So just sort and index your files before attempting to seek in them. Which makes sense since the index's job is primarily to allow seeking.
$endgroup$
$begingroup$
I was thinking about indexing, but then I thought that if that would be the problem pysam would complain about sorting/indexing, not about file being sam. Do yo think it would be worth opening them a ticket to make the error message bit more meaningful?
$endgroup$
– Kamil S Jaron
Apr 10 at 18:37
$begingroup$
Seeking, i.e. random access, requires an index. Note that this is not the only access pattern in pysam. You can access reads in a streaming fashion (for read in samfile:) without sorting, and you can access read pileups base-by-base (for column in samfile.pileup('chr1')), which does require an index if I understand correctly.
$endgroup$
– Daniel Standage
Apr 10 at 18:40
$begingroup$
@KamilSJaron probably, yes. It obviously caused you some pain, and I'm sure you won't be alone. I only thought of this because I have seen similar cryptic errors and was wondering how it could seek without an index anyway.
$endgroup$
– terdon♦
Apr 10 at 18:40
1
$begingroup$
@KamilSJaron Yes, I think it's an unhelpful error message, and it has caused me confusion in the past as well. If you open a ticket on Github I'd be happy to "pile on." :-)
$endgroup$
– Daniel Standage
Apr 10 at 18:41
$begingroup$
@DanielStandage I am this close to suspending you for that horrible pun! I would too if I could stop chuckling for long enough!
$endgroup$
– terdon♦
Apr 10 at 18:42
|
show 4 more comments
$begingroup$
That isn't actually a bam file as John Marshall figured out. I am keeping the rest of my answer since it could be useful to someone else, but the issue here was that you had a compressed (bgzipped) sam file and not an actual bam file and that's why you were getting that error. When I sorted your file in preparation for indexing it, I converted to a bam which is why the rest of my answer worked.
You don't have the index file for the bam file you're using. I used this script on the file you linked to (changing the name so that it corresponds to the right file and a contig in that file):
#!/usr/bin/env python3
import pysam
samfile = pysam.AlignmentFile("3_Tce_1_mapped.bam", "rb")
for read in samfile.fetch('3_Tce_b3v08_scaf005149'):
print(read)
samfile.close()
The directory I ran it in had:
$ ls 3*
3_Tce_1_mapped.bam
And I got the error you described:
$ foo.py
Traceback (most recent call last):
File "/home/terdon/scripts/foo.py", line 5, in <module>
for read in samfile.fetch('3_Tce_b3v08_scaf005149'):
File "pysam/libcalignmentfile.pyx", line 1107, in pysam.libcalignmentfile.AlignmentFile.fetch
ValueError: fetching by region is not available for SAM files
However, indexing the bam file fixed it:
$ samtools sort 3_Tce_1_mapped.bam > 3_Tce_1_mapped.sorted.bam
$ mv 3_Tce_1_mapped.sorted.bam 3_Tce_1_mapped.bam
$ samtools index 3_Tce_1_mapped.bam
$ ls 3*
3_Tce_1_mapped.bam 3_Tce_1_mapped.bam.bai
$ foo.py | wc
227 2724 16725
So just sort and index your files before attempting to seek in them. Which makes sense since the index's job is primarily to allow seeking.
$endgroup$
$begingroup$
I was thinking about indexing, but then I thought that if that would be the problem pysam would complain about sorting/indexing, not about file being sam. Do yo think it would be worth opening them a ticket to make the error message bit more meaningful?
$endgroup$
– Kamil S Jaron
Apr 10 at 18:37
$begingroup$
Seeking, i.e. random access, requires an index. Note that this is not the only access pattern in pysam. You can access reads in a streaming fashion (for read in samfile:) without sorting, and you can access read pileups base-by-base (for column in samfile.pileup('chr1')), which does require an index if I understand correctly.
$endgroup$
– Daniel Standage
Apr 10 at 18:40
$begingroup$
@KamilSJaron probably, yes. It obviously caused you some pain, and I'm sure you won't be alone. I only thought of this because I have seen similar cryptic errors and was wondering how it could seek without an index anyway.
$endgroup$
– terdon♦
Apr 10 at 18:40
1
$begingroup$
@KamilSJaron Yes, I think it's an unhelpful error message, and it has caused me confusion in the past as well. If you open a ticket on Github I'd be happy to "pile on." :-)
$endgroup$
– Daniel Standage
Apr 10 at 18:41
$begingroup$
@DanielStandage I am this close to suspending you for that horrible pun! I would too if I could stop chuckling for long enough!
$endgroup$
– terdon♦
Apr 10 at 18:42
|
show 4 more comments
$begingroup$
That isn't actually a bam file as John Marshall figured out. I am keeping the rest of my answer since it could be useful to someone else, but the issue here was that you had a compressed (bgzipped) sam file and not an actual bam file and that's why you were getting that error. When I sorted your file in preparation for indexing it, I converted to a bam which is why the rest of my answer worked.
You don't have the index file for the bam file you're using. I used this script on the file you linked to (changing the name so that it corresponds to the right file and a contig in that file):
#!/usr/bin/env python3
import pysam
samfile = pysam.AlignmentFile("3_Tce_1_mapped.bam", "rb")
for read in samfile.fetch('3_Tce_b3v08_scaf005149'):
print(read)
samfile.close()
The directory I ran it in had:
$ ls 3*
3_Tce_1_mapped.bam
And I got the error you described:
$ foo.py
Traceback (most recent call last):
File "/home/terdon/scripts/foo.py", line 5, in <module>
for read in samfile.fetch('3_Tce_b3v08_scaf005149'):
File "pysam/libcalignmentfile.pyx", line 1107, in pysam.libcalignmentfile.AlignmentFile.fetch
ValueError: fetching by region is not available for SAM files
However, indexing the bam file fixed it:
$ samtools sort 3_Tce_1_mapped.bam > 3_Tce_1_mapped.sorted.bam
$ mv 3_Tce_1_mapped.sorted.bam 3_Tce_1_mapped.bam
$ samtools index 3_Tce_1_mapped.bam
$ ls 3*
3_Tce_1_mapped.bam 3_Tce_1_mapped.bam.bai
$ foo.py | wc
227 2724 16725
So just sort and index your files before attempting to seek in them. Which makes sense since the index's job is primarily to allow seeking.
$endgroup$
That isn't actually a bam file as John Marshall figured out. I am keeping the rest of my answer since it could be useful to someone else, but the issue here was that you had a compressed (bgzipped) sam file and not an actual bam file and that's why you were getting that error. When I sorted your file in preparation for indexing it, I converted to a bam which is why the rest of my answer worked.
You don't have the index file for the bam file you're using. I used this script on the file you linked to (changing the name so that it corresponds to the right file and a contig in that file):
#!/usr/bin/env python3
import pysam
samfile = pysam.AlignmentFile("3_Tce_1_mapped.bam", "rb")
for read in samfile.fetch('3_Tce_b3v08_scaf005149'):
print(read)
samfile.close()
The directory I ran it in had:
$ ls 3*
3_Tce_1_mapped.bam
And I got the error you described:
$ foo.py
Traceback (most recent call last):
File "/home/terdon/scripts/foo.py", line 5, in <module>
for read in samfile.fetch('3_Tce_b3v08_scaf005149'):
File "pysam/libcalignmentfile.pyx", line 1107, in pysam.libcalignmentfile.AlignmentFile.fetch
ValueError: fetching by region is not available for SAM files
However, indexing the bam file fixed it:
$ samtools sort 3_Tce_1_mapped.bam > 3_Tce_1_mapped.sorted.bam
$ mv 3_Tce_1_mapped.sorted.bam 3_Tce_1_mapped.bam
$ samtools index 3_Tce_1_mapped.bam
$ ls 3*
3_Tce_1_mapped.bam 3_Tce_1_mapped.bam.bai
$ foo.py | wc
227 2724 16725
So just sort and index your files before attempting to seek in them. Which makes sense since the index's job is primarily to allow seeking.
edited Apr 11 at 9:54
answered Apr 10 at 16:10
terdon♦terdon
4,7752830
4,7752830
$begingroup$
I was thinking about indexing, but then I thought that if that would be the problem pysam would complain about sorting/indexing, not about file being sam. Do yo think it would be worth opening them a ticket to make the error message bit more meaningful?
$endgroup$
– Kamil S Jaron
Apr 10 at 18:37
$begingroup$
Seeking, i.e. random access, requires an index. Note that this is not the only access pattern in pysam. You can access reads in a streaming fashion (for read in samfile:) without sorting, and you can access read pileups base-by-base (for column in samfile.pileup('chr1')), which does require an index if I understand correctly.
$endgroup$
– Daniel Standage
Apr 10 at 18:40
$begingroup$
@KamilSJaron probably, yes. It obviously caused you some pain, and I'm sure you won't be alone. I only thought of this because I have seen similar cryptic errors and was wondering how it could seek without an index anyway.
$endgroup$
– terdon♦
Apr 10 at 18:40
1
$begingroup$
@KamilSJaron Yes, I think it's an unhelpful error message, and it has caused me confusion in the past as well. If you open a ticket on Github I'd be happy to "pile on." :-)
$endgroup$
– Daniel Standage
Apr 10 at 18:41
$begingroup$
@DanielStandage I am this close to suspending you for that horrible pun! I would too if I could stop chuckling for long enough!
$endgroup$
– terdon♦
Apr 10 at 18:42
|
show 4 more comments
$begingroup$
I was thinking about indexing, but then I thought that if that would be the problem pysam would complain about sorting/indexing, not about file being sam. Do yo think it would be worth opening them a ticket to make the error message bit more meaningful?
$endgroup$
– Kamil S Jaron
Apr 10 at 18:37
$begingroup$
Seeking, i.e. random access, requires an index. Note that this is not the only access pattern in pysam. You can access reads in a streaming fashion (for read in samfile:) without sorting, and you can access read pileups base-by-base (for column in samfile.pileup('chr1')), which does require an index if I understand correctly.
$endgroup$
– Daniel Standage
Apr 10 at 18:40
$begingroup$
@KamilSJaron probably, yes. It obviously caused you some pain, and I'm sure you won't be alone. I only thought of this because I have seen similar cryptic errors and was wondering how it could seek without an index anyway.
$endgroup$
– terdon♦
Apr 10 at 18:40
1
$begingroup$
@KamilSJaron Yes, I think it's an unhelpful error message, and it has caused me confusion in the past as well. If you open a ticket on Github I'd be happy to "pile on." :-)
$endgroup$
– Daniel Standage
Apr 10 at 18:41
$begingroup$
@DanielStandage I am this close to suspending you for that horrible pun! I would too if I could stop chuckling for long enough!
$endgroup$
– terdon♦
Apr 10 at 18:42
$begingroup$
I was thinking about indexing, but then I thought that if that would be the problem pysam would complain about sorting/indexing, not about file being sam. Do yo think it would be worth opening them a ticket to make the error message bit more meaningful?
$endgroup$
– Kamil S Jaron
Apr 10 at 18:37
$begingroup$
I was thinking about indexing, but then I thought that if that would be the problem pysam would complain about sorting/indexing, not about file being sam. Do yo think it would be worth opening them a ticket to make the error message bit more meaningful?
$endgroup$
– Kamil S Jaron
Apr 10 at 18:37
$begingroup$
Seeking, i.e. random access, requires an index. Note that this is not the only access pattern in pysam. You can access reads in a streaming fashion (
for read in samfile:) without sorting, and you can access read pileups base-by-base (for column in samfile.pileup('chr1')), which does require an index if I understand correctly.$endgroup$
– Daniel Standage
Apr 10 at 18:40
$begingroup$
Seeking, i.e. random access, requires an index. Note that this is not the only access pattern in pysam. You can access reads in a streaming fashion (
for read in samfile:) without sorting, and you can access read pileups base-by-base (for column in samfile.pileup('chr1')), which does require an index if I understand correctly.$endgroup$
– Daniel Standage
Apr 10 at 18:40
$begingroup$
@KamilSJaron probably, yes. It obviously caused you some pain, and I'm sure you won't be alone. I only thought of this because I have seen similar cryptic errors and was wondering how it could seek without an index anyway.
$endgroup$
– terdon♦
Apr 10 at 18:40
$begingroup$
@KamilSJaron probably, yes. It obviously caused you some pain, and I'm sure you won't be alone. I only thought of this because I have seen similar cryptic errors and was wondering how it could seek without an index anyway.
$endgroup$
– terdon♦
Apr 10 at 18:40
1
1
$begingroup$
@KamilSJaron Yes, I think it's an unhelpful error message, and it has caused me confusion in the past as well. If you open a ticket on Github I'd be happy to "pile on." :-)
$endgroup$
– Daniel Standage
Apr 10 at 18:41
$begingroup$
@KamilSJaron Yes, I think it's an unhelpful error message, and it has caused me confusion in the past as well. If you open a ticket on Github I'd be happy to "pile on." :-)
$endgroup$
– Daniel Standage
Apr 10 at 18:41
$begingroup$
@DanielStandage I am this close to suspending you for that horrible pun! I would too if I could stop chuckling for long enough!
$endgroup$
– terdon♦
Apr 10 at 18:42
$begingroup$
@DanielStandage I am this close to suspending you for that horrible pun! I would too if I could stop chuckling for long enough!
$endgroup$
– terdon♦
Apr 10 at 18:42
|
show 4 more comments
Thanks for contributing an answer to Bioinformatics Stack Exchange!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
Use MathJax to format equations. MathJax reference.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fbioinformatics.stackexchange.com%2fquestions%2f7436%2fwhat-does-fetching-by-region-is-not-available-for-sam-files-mean%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown