Russell Index Member Companies

Version 1.1.0.0 (272.5 KB) by Raj Sodhi
Downloads a PDF file from www.Russell.com to get the Russell Index member companies.
594 downloads
Updated 2012/1/26

I was looking for a data feed for historical options prices when I stumbled upon http://www.trade-strategy.com/. The site is largely broken: most of the links don't work. (Maybe the author got picked up by some company.) On the downloads page I found a series of MATLAB snippets, one of which tries to download the member companies of various Russell indices. I was unable to run that code, probably because the web site has since renamed the PDF file and because the author did not make his .jar files available. But I did get some neat ideas.

Using MATLAB's Java external interface in combination with a .jar file, one can greatly extend the capabilities of MATLAB. From http://pdfbox.apache.org/download.html, one can download
* pdfbox-1.x.0.jar (in my case it was pdfbox-1.1.0.jar)
* fontbox-1.x.0.jar (in my case it was fontbox-1.1.0.jar)
and use these Java classes and methods to extract the text from a PDF file.
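
As a rough illustration of the idea, the sketch below puts the two .jar files on MATLAB's dynamic Java class path and pulls the text out of a local PDF. The function name extractPdfText and its arguments are hypothetical and not part of the submission. The class names assume the PDFBox 1.x layout (org.apache.pdfbox.util.PDFTextStripper); later PDFBox releases moved the stripper to org.apache.pdfbox.text and changed how documents are loaded, so adjust the names to the version you downloaded.

```matlab
function txt = extractPdfText(pdfPath, jarDir)
% EXTRACTPDFTEXT  Sketch: return the text of a PDF file as a char row.
%   pdfPath - path to a PDF file already on disk
%   jarDir  - folder holding pdfbox-1.1.0.jar and fontbox-1.1.0.jar
%             (file names assumed; use the versions you downloaded)

% Make the PDFBox classes visible to MATLAB's Java interface.
javaaddpath({fullfile(jarDir, 'pdfbox-1.1.0.jar'), ...
             fullfile(jarDir, 'fontbox-1.1.0.jar')});

% javaMethod/javaObject look the classes up by name at run time,
% after the class path has been extended.
pdfFile = java.io.File(pdfPath);
doc = javaMethod('load', 'org.apache.pdfbox.pdmodel.PDDocument', pdfFile);
try
    % PDFBox 1.x class name (an assumption about your version).
    stripper = javaObject('org.apache.pdfbox.util.PDFTextStripper');
    txt = char(stripper.getText(doc));
catch err
    doc.close();   % release the file before reporting the error
    rethrow(err);
end
doc.close();
end
```

A call such as txt = extractPdfText('members.pdf', pwd) would return the raw text, where members.pdf is a placeholder file name; the text can then be split into lines with regexp(txt, '\r?\n', 'split').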

The top-level program, getRussellTickers2.m, does the following.
* It goes to the www.Russell.com web site and retrieves the list of PDF files.
* It lets the user choose which Russell index to download and parse.
* It downloads the PDF file using java.net.URL.
* It extracts the text using PDFTextStripper.
* It cleans up the text, keeping just a cell array of strings containing the company names and ticker symbols.
* It parses each line: the ticker symbol is the last word, and the company name comprises the rest.
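
The last step can be sketched in a few lines of MATLAB. This is not the code from getRussellTickers2.m, only one way to apply the "last word is the ticker" rule; the function name splitNameAndTicker and the sample lines in the usage note are made up for illustration.

```matlab
function [names, tickers] = splitNameAndTicker(lines)
% SPLITNAMEANDTICKER  Sketch: split 'Company Name TICKER' lines.
%   lines   - cell array of char rows, one per line of extracted text
%   names   - column cell array of company names
%   tickers - column cell array of ticker symbols (last word of each line)

% Trim whitespace and drop blank lines.
lines = strtrim(lines(:));
lines = lines(~cellfun(@isempty, lines));

% Capture everything up to the last run of whitespace as the name,
% and the final word as the ticker.
tokens = regexp(lines, '^(.*\S)\s+(\S+)$', 'tokens', 'once');

% Lines with fewer than two words do not match and are skipped.
tokens = tokens(~cellfun(@isempty, tokens));

names   = cellfun(@(t) t{1}, tokens, 'UniformOutput', false);
tickers = cellfun(@(t) t{2}, tokens, 'UniformOutput', false);
end
```

For example, [n, t] = splitNameAndTicker({'Example Holdings Inc EXH', 'Sample Corp SMPL'}) gives the names 'Example Holdings Inc' and 'Sample Corp' with the tickers 'EXH' and 'SMPL'. A real index PDF will likely contain header and footer lines that need filtering before this step.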

Instructions:
* Download 'fontbox-1.1.0.jar' and 'pdfbox-1.1.0.jar' from http://pdfbox.apache.org/download.html
(or just get the latest versions).
* Place them in the same directory as this .m file.
* Download and install the latest Java Development Kit.
* Add "C:\Program Files\Java\jdk1.7.0_02\bin" to your PATH environment variable
(your JDK version will very likely be different).
* Run the script file getRussellTickers2.

To get a complete list of the classes in a .jar file, use the jar system command (easily run from MATLAB with a leading "!"):
!jar tf pdfbox-1.1.0.jar
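
If the jar tool is not on your PATH, a similar listing can probably be obtained from inside MATLAB with the standard java.util.jar.JarFile class. The helper below is a hypothetical sketch, not part of the submission; it returns only the entries whose names end in .class.

```matlab
function classes = listJarClasses(jarPath)
% LISTJARCLASSES  Sketch: list the .class entries of a .jar (or .zip) file.
%   jarPath - path to the archive, e.g. 'pdfbox-1.1.0.jar'
%   classes - column cell array of entry names ending in '.class'

jar = java.util.jar.JarFile(jarPath);
entries = jar.entries();          % java.util.Enumeration of JarEntry
classes = {};
while entries.hasMoreElements()
    entry = entries.nextElement();
    name = char(entry.getName());
    % Keep class files only; skip folders, manifests and resources.
    if ~isempty(regexp(name, '\.class$', 'once'))
        classes{end+1, 1} = name; %#ok<AGROW>
    end
end
jar.close();
end
```

Calling listJarClasses('pdfbox-1.1.0.jar') from the folder that holds the file returns names in the same path form that jar tf prints, such as entries under org/apache/pdfbox/.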

Enjoy!

Raj

Cite As

Raj Sodhi (2024). Russell Index Member Companies (https://www.mathworks.com/matlabcentral/fileexchange/28071-russell-index-member-companies), MATLAB Central File Exchange.

MATLAB Release Compatibility
Created with R2010b
Compatible with any release
Platform Compatibility
Windows macOS Linux
Categories
Find more on String Parsing in Help Center and MATLAB Answers

Version Published Release Notes
1.1.0.0

Somehow, it stopped working, perhaps because Matlab evolved between 2008 and 2010. Got it working again.

1.0.0.0