Monthly
Article
CSIM™ Format
Notice:
Copyright (c) 1998 Commodity Systems Inc. (CSI). All rights are reserved.
11/17/1998

This document describes an adaptation of the abandoned CompuTrac® data format, which until recently was actively used by Equis' MetaStock® charting software. CSI has decided to rename the format because the Y2K extensions made it unique to CSI's proprietary use. CSI will continue updating to this format, with backward-compatible extensions that allow updating past January 1, 2000 and through the end of the 21st century.

The chat forums on the Internet present disturbing and conflicting stories about how high-profile software developers will handle the Y2K problem as it pertains to data formats. One source said that the expensive upgrade of a popular analysis program will exclude independent firms like CSI as data sources. If these rumors become reality, you may be asked to pay for something that will hurt, rather than help, you in your trading efforts.

We received an e-mail from Omega Research (makers of SuperCharts® and TradeStation®) saying that they will continue to support the CSI format into the year 2000. They are also considering support of the CSIMillennium format, but haven't made a decision yet. We have had unconfirmed reports that Equis (MetaStock®) plans to extend its older MetaStock format for 20 or 30 years into the next century, but no details have been supplied on how they might accomplish this or whether it will be publicly disclosed.

There are many other smaller software developers who must become Y2K compliant. Should developers elect to keep their formats secret and exclude outside data firms, you may have to decide between abandoning the CSI data service and abandoning the analysis software you have already purchased. CSI has upgraded the CSI format to be Y2K compliant, and we have extended the former CompuTrac format to operate through the 21st century. Anyone contemplating a purchase of new software should insist that the program reads and writes either the CSI format or the CSIMillennium format, the details of which are enclosed.
We ask all users to pass this on to their favorite software producer and urge them to adopt the CSI formats in their new upgrades. We believe the best bet for all concerned is to use the CSIMillennium format, which is declared an open format and which carries little risk of failing to do the job immediately. All of these issues are subject to clarification and solidification as the year 2000 approaches. If you want to ensure that your database and analysis software remain compatible, let your software developer know that you wish to use CSI as your data source. This requires only that analysis programs maintain compatibility with the CSIM and/or the CSI QuickTrieve format in addition to the other formats they support. To voice your concern, contact:
Omega Research
MetaStock
CSI Millennium Format Specification
Important Notice
The CSIMillennium™ format and the CSI™ basic format are trademarked properties of CSI. The descriptive material and the actual format structures are copyrighted © properties of CSI. All rights reserved. Both the formats and their specifications are restricted to registered users who are granted a license for their use. CSI claims a copyright on these formats to assure that no other firm will claim them and/or demand payment for their use in any way. We will not unreasonably withhold permission from those wishing to use these formats, even if the user is a competitor. No user of these formats will be paid any sum of money for developing CSI-compatible programs. An email with the password necessary for download will be immediately forwarded to your email address. Please consult CSI's website for the most current details before working with this open-to-the-public format.
Current Specification
To accurately access the data files within a given directory, the programmer must read that same directory's master file list, which uniquely identifies the specific market data files (time series) stored in that directory. This master file list is named MASTER and consists of up to 256 records, each 53 bytes in length. The fields are formatted as follows:
MASTER FILE RECORD LAYOUT (MASTER)
Record 1:
DESCRIPTION Position Length Format
The "Last Entry Used" field is accessed in order to assign the next file number to a .DAT/.DOP file combination. At file creation, this field is initialized to zero, which indicates that the first file to create will be F1.DAT. Programs that only need to read the data files can ignore this field.
Special NOTE: Even though the "Number of Entries" field is two bytes in length, the stored file number is only one byte; therefore the maximum file number cannot exceed 255. If the last entry used has the value 255, and the number of entries is less than 255, then you must scan the master file list for an unused number. The pseudocode is shown below:
FileNumbers() - Array of integers holding the file numbers of the master file list
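The file-number assignment rule described above can be sketched in Python roughly as follows. This is a minimal illustration rather than CSI's own pseudocode: the function name `next_file_number` and its arguments are hypothetical, and it assumes the caller has already read the "Last Entry Used" value and the list of file numbers from MASTER.

```python
from typing import Iterable, Optional

MAX_FILE_NUMBER = 255  # the stored file number is a single byte


def next_file_number(last_entry_used: int,
                     file_numbers: Iterable[int]) -> Optional[int]:
    """Return N for the next F<N>.DAT/F<N>.DOP pair, or None if none is free.

    last_entry_used -- value of the "Last Entry Used" field (0 in a new MASTER)
    file_numbers    -- file numbers found in the existing master records
    """
    # Normal case: hand out the number following the last one used,
    # so a freshly created MASTER (field = 0) yields F1.DAT.
    if last_entry_used < MAX_FILE_NUMBER:
        return last_entry_used + 1

    # The last entry used is 255: scan 1..255 for a number that no
    # master record currently refers to.
    used = set(file_numbers)
    for candidate in range(1, MAX_FILE_NUMBER + 1):
        if candidate not in used:
            return candidate

    return None  # all 255 file numbers are taken
```

For example, `next_file_number(255, [1, 2, 4, 255])` reuses the gap at 3, while a directory holding all 255 files yields `None`.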
Records 2 through Number of entries+1: DESCRIPTION Position Length Format
The 17 byte symbol area is further divided as follows for usage by QuickTrieve (all ASCII characters): Description: Position Length Format
Unlike QuickTrieve, which uses commodity numbers for identification, CSI's Unfair Advantage system exclusively uses the first eight characters of the 17-character symbol area to uniquely identify stocks, futures and options.
NOTES:
1.) The File Number represents the physical file number on disk for the corresponding data file. For example, if the byte is a 5, then this record corresponds to data file F5.DAT and its companion file F5.DOP. See the section entitled DESCRIPTOR FILE LAYOUT for a discussion on how the DOP files relate to the DAT files.
2.) Number of data fields in the data file. This will always be the record length divided by 4, since all data fields are 4-byte single precision floating point numbers.
3.) The century indicator byte is used to signify the century of the delivery year for commodities and stock options. The following values may be found in this byte:
18: Delivery century is 1800's
Any other value is considered invalid, and the delivery year will be assumed to fall within the 1921-2020 year period. If the delivery year is greater than 20, the century is assumed to be 1900's; if the delivery year is less than or equal to 20, the century is assumed to be 2000's. Examples: delivery year of 15=2015, delivery year of 21=1921.
4.) Type Flag: @=Non-option stock or commodity, 1=Commodity Option, 2=Stock Option. If there is no number at position 10-12 (or if the number is zero), the item is a stock; otherwise it is a commodity.
5.) For stocks, the symbol field is the CSI symbol. For commodities, the symbol field is the first two characters of the CSI symbol (the third character of the CSI commodity symbol is stored at position 9), followed by the two digit delivery month, followed by the last two characters of the delivery year. The Delivery Month/Delivery Year combination must be stored at two different places within the master file record, including here in the symbol field. They are placed here as well as at position 19-23 because MetaStock requires a unique symbol for each data file, and because MetaStock would not otherwise display these important contract identifiers in selection screens.
6.) Conversion Factor codes: -4=Q -3=P -2=O -1=N 0=K 1=J 2=I 3=H 4=G 5=F
7.) Should the CSI commodity inventory ever exceed 999, please consult the CSI website for updated information.
8.) Delivery Month Code for Options: A-L = Delivery month 1-12 for CALLS, M-X = Delivery month 1-12 for PUTS.
9.) Users of this format should regularly consult the CSI website and this document for changes and announcements concerning the CSIM format.
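Notes 3 and 8 lend themselves to a short illustration. The Python sketch below is not part of the specification: the function names are hypothetical, and the set of valid century bytes is an assumption, since only the value 18 is listed in the text above (19 and 20 are assumed here to follow the same pattern).

```python
from typing import Tuple

# Assumption: only 18 appears in the text above; 19 and 20 are assumed
# to be the other valid century indicator values.
VALID_CENTURIES = (18, 19, 20)


def resolve_delivery_year(two_digit_year: int, century_byte: int) -> int:
    """Return the four-digit delivery year for a master file record."""
    if century_byte in VALID_CENTURIES:
        return century_byte * 100 + two_digit_year
    # Invalid century byte: fall back to the 1921-2020 window (note 3).
    if two_digit_year > 20:
        return 1900 + two_digit_year
    return 2000 + two_digit_year


def decode_option_month(code: str) -> Tuple[str, int]:
    """Decode an option delivery month code (note 8).

    A-L map to CALLS for months 1-12, M-X map to PUTS for months 1-12.
    """
    if len(code) != 1 or not "A" <= code <= "X":
        raise ValueError(f"invalid option month code: {code!r}")
    index = ord(code) - ord("A")
    if index < 12:
        return ("CALL", index + 1)
    return ("PUT", index - 11)
```

With an invalid century byte, `resolve_delivery_year(15, 0)` gives 2015 and `resolve_delivery_year(21, 0)` gives 1921, matching the examples in note 3.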
DATA FILE RECORD LAYOUT
Data is formatted on disk in a variable length record with all information in binary format. The filename is determined by the File Number field of the master file entry, e.g. if the file number field contains a binary five, the physical data file name on disk is F5.DAT and the descriptor file is F5.DOP. The record length is set by the Record Length field of the master file entry.
NOTE FOR METASTOCK COMPATIBILITY: MetaStock versions prior to version 6.5 restrict the flexibility inherent in the format by forcing a special case data file of length 28 bytes (7 fields).
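As a small illustration of the naming and record-length rules above, the following Python sketch derives the file names and the field count from values read out of a master file entry. The helper names are hypothetical and not part of the format.

```python
from typing import Tuple


def data_file_names(file_number: int) -> Tuple[str, str]:
    """Return the (.DAT, .DOP) file names for a master record's File Number."""
    if not 1 <= file_number <= 255:
        raise ValueError("file number must fit in one byte and start at 1")
    return (f"F{file_number}.DAT", f"F{file_number}.DOP")


def field_count(record_length: int) -> int:
    """Return the number of 4-byte fields in a record of the given length."""
    if record_length <= 0 or record_length % 4 != 0:
        raise ValueError("record length must be a positive multiple of 4")
    return record_length // 4
```

For the MetaStock special case, `field_count(28)` gives 7 fields, and `data_file_names(5)` gives `("F5.DAT", "F5.DOP")`.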
Header Record (record 1 of the data file):
Data Records (Records 2-Last posted record)
DESCRIPTOR FILE LAYOUT
The descriptor (.DOP) file is a sequential (carriage return/linefeed delimited) file holding the names of all data fields present for a particular data file. The number of records in this file is determined by the Number of Fields entry of the master file record. Each record of the sequential file is of the format:
"FieldName",InputConversionFactor,DisplayConversionFactor
An example descriptor file is shown below:
"DATE",0,0
The above example is typical of most data files. The DATE, VOL and OI fields always have a conversion factor of 0, while the OPEN, HIGH, LOW and CLOSE price fields have the conversion factor of the commodity represented.
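A descriptor file in the record format described above can be read with a few lines of Python. This is a sketch only: `FieldDescriptor` and `parse_descriptor` are hypothetical names, and the conversion factor of 2 used in the usage note is an arbitrary illustrative value, not one taken from a real CSI file.

```python
import csv
import io
from typing import List, NamedTuple


class FieldDescriptor(NamedTuple):
    name: str            # e.g. "DATE", "OPEN", "CLOSE"
    input_factor: int    # InputConversionFactor
    display_factor: int  # DisplayConversionFactor


def parse_descriptor(text: str) -> List[FieldDescriptor]:
    """Parse the contents of a .DOP file into one descriptor per record."""
    # newline="" leaves the CR/LF pairs for the csv module to handle.
    reader = csv.reader(io.StringIO(text, newline=""))
    return [
        FieldDescriptor(row[0], int(row[1]), int(row[2]))
        for row in reader
        if row  # skip any blank trailing line
    ]
```

Calling `parse_descriptor('"DATE",0,0\r\n"CLOSE",2,2\r\n')` returns two descriptors, the first for DATE with both factors 0. The order of the records is assumed to match the order of the 4-byte fields in each data record.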
IMPORTANT NOTES ABOUT CONVERSION FACTORS:
1) When reading CSIM files you generally do not have to worry about the input conversion factor, because the stored numbers are all in adjusted decimal format and ready for internal calculation. This differs from the CSI QuickTrieve format, which stores all values as whole numbers that must be converted to decimal before doing arithmetic calculations. The display conversion factor is used to display the scale on the chart for viewing by the end user.
2) The original CompuTrac system interprets negative conversion factors for raw market information differently from CSI's system of conversion factors used in QuickTrieve and Unfair Advantage applications. Specifically, a conversion factor of -1 for the CompuTrac format means halves, and a conversion factor of -2 means quarters, for which the QuickTrieve format has no equivalent. The CompuTrac conversion factor of -3 means eighths, which is equivalent to a QuickTrieve conversion factor of -1. To summarize the CompuTrac factors: -1=halves, -2=quarters, -3=eighths, -4=sixteenths, -5=thirty-seconds, -6=sixty-fourths.
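The negative CompuTrac factors listed above follow a simple power-of-two pattern: a factor of -n corresponds to a fractional denominator of 2 to the power n. A minimal Python sketch, with the hypothetical helper name `computrac_denominator`, covering only the six values the text enumerates:

```python
def computrac_denominator(conversion_factor: int) -> int:
    """Return the fraction denominator for a negative CompuTrac factor.

    -1 -> 2 (halves), -2 -> 4 (quarters), -3 -> 8 (eighths),
    -4 -> 16, -5 -> 32, -6 -> 64.
    """
    if not -6 <= conversion_factor <= -1:
        raise ValueError("only factors -1 through -6 are described here")
    return 2 ** (-conversion_factor)
```

Positive and zero factors are outside the scope of this sketch, since the text above describes only the negative CompuTrac values.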
Also Please Note:
1.) MBF stands for Microsoft Binary Format. It is a method of storing binary floating point numbers that has since been replaced by the IEEE standard format in most computer languages. Most compilers have some type of conversion function that will convert from MBF to IEEE and back. If yours does not, ask your CSI marketing representative for our functions available for C, Delphi and Turbo Pascal applications that will perform this numeric conversion.
2.) The first physical date and last physical date fields stored in the master file, as well as the date field in each data record, are stored in the following manner:
3.) Dates after December 31, 1999 are stored with a leading one to make a seven-digit number. Examples: January 1, 2000=1000101, February 20, 2004=1040220.
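For readers without a compiler-supplied conversion routine, the Python sketch below shows one way to decode a 4-byte MBF single and to turn a stored date number into a calendar date. It is an illustration, not CSI's conversion code. The function names are hypothetical, and the date decoder assumes that dates before 2000 are stored as six-digit YYMMDD numbers, which is consistent with the seven-digit examples in note 3 but is not stated explicitly in the text above.

```python
import datetime


def mbf_to_float(raw: bytes) -> float:
    """Convert a 4-byte Microsoft Binary Format single to a Python float.

    MBF layout (little-endian): bytes 0-1 hold the low mantissa bits,
    byte 2 holds the sign bit plus the top 7 mantissa bits, and byte 3
    holds the exponent with a bias of 129 relative to a 1.m mantissa.
    """
    if len(raw) != 4:
        raise ValueError("an MBF single is exactly 4 bytes")
    exponent = raw[3]
    if exponent == 0:
        return 0.0  # a zero exponent byte always means zero in MBF
    sign = -1.0 if raw[2] & 0x80 else 1.0
    mantissa = ((raw[2] & 0x7F) << 16) | (raw[1] << 8) | raw[0]
    return sign * (1.0 + mantissa / 2 ** 23) * 2.0 ** (exponent - 129)


def decode_date(stored: float) -> datetime.date:
    """Decode a stored date number such as 1000101 (January 1, 2000).

    Assumption: the number is (years since 1900) * 10000 + month * 100 + day,
    so 981117 would be November 17, 1998 and 1040220 is February 20, 2004.
    """
    number = int(round(stored))
    year = 1900 + number // 10000
    month = (number // 100) % 100
    day = number % 100
    return datetime.date(year, month, day)
```

Both example values from note 3 fit exactly in the 24-bit mantissa of a single precision number, so no rounding error is expected when a date is read through `mbf_to_float` and passed to `decode_date`.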