You can not only split, but you can use this convenient method to filter (or preprocess) your files, for example you can filter by quality or by flags very easily (subsetting a file indexed BioMap without the need to create temporal files). In the next example I load a 2.6Gb BAM file and I save only reads which lengths are 75 and were mapped to chromosome 1.
>> bhe1 = BioMap('E:\rnaseq\humanembryonic\SRX026674\SRR065504\accepted_hits.bam')
bhe1 =
BioMap with properties:
SequenceDictionary: {1x25 cell}
Reference: [41927566x1 File indexed property]
Signature: [41927566x1 File indexed property]
Start: [41927566x1 File indexed property]
MappingQuality: [41927566x1 File indexed property]
Flag: [41927566x1 File indexed property]
MatePosition: [41927566x1 File indexed property]
Quality: [41927566x1 File indexed property]
Sequence: [41927566x1 File indexed property]
Header: [41927566x1 File indexed property]
NSeqs: 41927566
Name: ''
>> len = bhe1.getStop-bhe1.getStart+1; % using short reads mapped to transcripts
bnhe1 = getSubset(bhe1,len==75)
bnhe1 =
BioMap with properties:
SequenceDictionary: {1x25 cell}
Reference: [30558482x1 File indexed property]
Signature: [30558482x1 File indexed property]
Start: [30558482x1 File indexed property]
MappingQuality: [30558482x1 File indexed property]
Flag: [30558482x1 File indexed property]
MatePosition: [30558482x1 File indexed property]
Quality: [30558482x1 File indexed property]
Sequence: [30558482x1 File indexed property]
Header: [30558482x1 File indexed property]
NSeqs: 30558482
Name: ''
>> bnhe2 = getSubset(bnhe1,'SelectReference',1)
bnhe2 =
BioMap with properties:
SequenceDictionary: '1'
Reference: [2907478x1 File indexed property]
Signature: [2907478x1 File indexed property]
Start: [2907478x1 File indexed property]
MappingQuality: [2907478x1 File indexed property]
Flag: [2907478x1 File indexed property]
MatePosition: [2907478x1 File indexed property]
Quality: [2907478x1 File indexed property]
Sequence: [2907478x1 File indexed property]
Header: [2907478x1 File indexed property]
NSeqs: 2907478
Name: ''
>> write(bnhe2,'newbam')
>> ! ls -l E:\rnaseq\humanembryonic\SRX026674\SRR065504\new*
-rwxrwxrwa 183729053 Apr 9 20:38 E:/rnaseq/humanembryonic/SRX026674/SRR065504/newbam.bam
>> nb = BioMap('E:\rnaseq\humanembryonic\SRX026674\SRR065504\newbam.bam')
nb =
BioMap with properties:
SequenceDictionary: {1x25 cell}
Reference: [2907478x1 File indexed property]
Signature: [2907478x1 File indexed property]
Start: [2907478x1 File indexed property]
MappingQuality: [2907478x1 File indexed property]
Flag: [2907478x1 File indexed property]
MatePosition: [2907478x1 File indexed property]
Quality: [2907478x1 File indexed property]
Sequence: [2907478x1 File indexed property]
Header: [2907478x1 File indexed property]
NSeqs: 2907478
Name: ''
>> ! ls -l E:\rnaseq\humanembryonic\SRX026674\SRR065504\new*
-rwxrwxrwa 183729053 Apr 9 20:38 E:/rnaseq/humanembryonic/SRX026674/SRR065504/newbam.bam
-rwxrwxrwa 342176 Apr 9 20:39 E:/rnaseq/humanembryonic/SRX026674/SRR065504/newbam.bam.bai
-rwxrwxrwa 6621666 Apr 9 20:40 E:/rnaseq/humanembryonic/SRX026674/SRR065504/newbam.bam.linearindex
>>