The sound reading routines are all listed here. Reading a file with any of them gives you the samples as one column vector per channel. From there, apply whatever operations on the data are appropriate for what you are trying to do.
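As a rough illustration of the one-column-per-channel layout, here is a minimal sketch. It assumes audioread() is the reading routine you pick (the answer above does not name one), and it writes a small two-channel test file with audiowrite() first so the example does not depend on a file you may not have. The sample rate, tone frequencies, and scratch file name are all made up for the example.

```matlab
% Assumption: audioread/audiowrite stand in for whichever routine you choose.
Fs = 8000;                           % sample rate in Hz (illustrative value)
t  = (0:Fs-1).' / Fs;                % one second of time stamps, as a column
y  = 0.5 * [sin(2*pi*440*t), sin(2*pi*880*t)];   % two-channel test signal

wavFile = [tempname, '.wav'];        % hypothetical scratch file
audiowrite(wavFile, y, Fs);          % write the test signal to disk

[data, FsRead] = audioread(wavFile); % data is samples-by-channels
left  = data(:, 1);                  % column vector for channel 1
right = data(:, 2);                  % column vector for channel 2

delete(wavFile);                     % clean up the scratch file
```

After this, left and right are ordinary numeric vectors, so any array operation can be applied to them.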
There is no "Sound Processing Toolkit" or anything like that. Working with sounds is usually a matter of applying filters or taking an fft(). Some filters are most easily expressed with conv().
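To make that concrete, here is a small sketch of both ideas on a synthetic signal: a moving-average filter written with conv(), and a magnitude spectrum from fft(). The signal, the filter length N, and the variable names are assumptions chosen for illustration; a moving average is only one simple example of a filter, not a recommendation for any particular task.

```matlab
Fs = 8000;                               % sample rate in Hz (illustrative value)
t  = (0:Fs-1).' / Fs;                    % one second of samples, as a column
x  = sin(2*pi*200*t) + 0.5*sin(2*pi*3000*t);   % low tone plus high tone

% Moving-average (FIR low-pass) filter expressed as a convolution.
N = 8;                                   % filter length (hypothetical choice)
h = ones(N, 1) / N;                      % equal weights that sum to 1
xSmooth = conv(x, h, 'same');            % 'same' keeps the output length of x

% Magnitude spectra before and after filtering.
X = abs(fft(x));
Y = abs(fft(xSmooth));
f = (0:numel(x)-1).' * (Fs / numel(x));  % frequency of each fft bin in Hz

% Compare the high-tone bin before and after filtering.
[~, k] = min(abs(f - 3000));             % index of the bin nearest 3000 Hz
ratio = Y(k) / X(k);                     % below 1 means the tone was attenuated
```

Plotting f against X and Y (for example with plot(f, X, f, Y)) shows how the filter changes the spectrum.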